Partly Cloudy With a Chance of Lava Flows: Forecasting Volcanic Eruptions in the Twenty‐First Century

A primary goal of volcanology is forecasting hazardous eruptive activity. Despite much progress over the last century, however, volcanoes still erupt with no detected precursors, lives and livelihoods are lost to eruptive activity, and forecasting the onsets of eruptions remains fraught with uncertainty. Long-term forecasts are generally derived from the geological and historical records, from which recurrence intervals and styles of activity can be inferred, while shorter-term forecasts are derived from patterns in monitoring data. Information from geology and monitoring data can be evaluated and combined using statistical analysis, expert elicitation, and conceptual and/or mathematical models. Integrative frameworks, such as event trees, combine this diversity of information to produce probabilistic forecasts that can inform the style and scale of the societal response to a potential future eruption. Several developments show promise to revolutionize the utility and accuracy of these forecasts. These include growth in the quantity and quality of multidisciplinary monitoring data, coupled with increases in computing power; machine learning algorithms, which will allow far better utilization of this growing volume of data; and new physicochemical volcano models and data assimilation algorithms, which take advantage of a wide range of monitoring data and realistic physics to better predict the evolution of a given physical state. Although eruption forecasts may never be as generally reliable as weather forecasts, and great caution must be exercised when attempting to predict highly complex volcanic behavior, these and other innovations, particularly when combined in integrative, fully probabilistic forecasting frameworks, should help volcanologists to better issue warnings of volcanic activity on societally relevant time frames.

Plain Language Summary
Unlike many natural hazards, volcanoes often give warning signs from minutes to even years before they erupt. Detecting this activity, interpreting it, and using it to accurately forecast likely outcomes, however, remains a key challenge for volcanologists. A new generation of ground- and space-based sensors is recording volcanic unrest in unprecedented spatial and temporal detail, providing new views of eruption precursors that might not have been detected just a few years ago. Current forecasts are based largely on pattern recognition and analogy, but recent advances in volcano modeling offer hope that eruption forecasting may be augmented by knowledge of the fundamental physics governing eruptive processes. The growing availability of data will also be increasingly utilized by machine learning approaches that detect patterns and inform forecasts in ways not currently possible. Insights from these new techniques can be combined using probabilistic frameworks with the eruptive history of a volcano, changes in monitoring data, expert opinion, and other sources of information to yield forecasts of volcanic activity that might begin to resemble weather forecasts. Such an advance would be of obvious societal benefit in a world where millions of people live in the shadows of active volcanoes.

Forecasting Volcanic Activity: A Tractable Problem for Science and Society
"Present trends indicate that the accuracy of forecasting is improving, and there is reasonable hope that unless scientists promise, or the public expects, too much too soon, scientific forecasting of natural events such as volcanic eruptions will become a valuable and respected endeavor." -- Decker (1986)
A fundamental goal of volcanology is to accurately forecast eruptions and their hazards as a means of mitigating their impact on society. Worldwide, about 30 million people live within 10 km of a Holocene volcano (Brown, Auker, et al., 2015). Historical eruptions have claimed tens of thousands of lives from direct effects (Auker et al., 2013; Brown et al., 2017) and countless more from secondary impacts, like changes in regional
5. In late April 2018, it was clear that a new dike intrusion or lava breakout at Kīlauea Volcano, Hawai'i, was likely, but the magnitude of that change to the volcano's 35-year-long eruption and the resulting devastation to the surrounding community (over 700 structures destroyed) were not anticipated (Neal et al., 2019).
These examples highlight the challenge and importance of forecasting not only the onset of volcanic activity but also its style/magnitude and duration, information upon which evacuation strategies are commonly based (e.g., Papale & Marzocchi, 2019; Wolpert et al., 2016). At present, however, volcanologists have found it most feasible to forecast either the onset of eruptions at closed-system volcanoes that have repose periods of decades or longer (e.g., Cameron et al., 2018; Pesicek et al., 2018), or repeating eruptive behavior over much shorter time scales (e.g., Blake & Cortés, 2018; Connor et al., 2003; Jenkins et al., 2019; Kamo & Ishihara, 1989; Ripepe et al., 2018; Swanson et al., 1983). This is largely because dormant volcanoes are most likely to show signs of "waking up," and because most forecasting efforts are rooted in pattern recognition, where patterns are thought to record specific subsurface processes such as magma ascent (e.g., White & McCausland, 2019). Most current forecasting efforts rely heavily upon extrapolating patterns using statistical models and expert opinion (e.g., Aspinall & Cooke, 1998, 2013; Marzocchi et al., 2008; Newhall & Pallister, 2015; Selva et al., 2012). The increasing availability of monitoring data at many volcanoes around the world, coupled with growing computational power, however, opens new possibilities for forecasts that utilize machine learning algorithms, as well as models of the physics and chemistry of magma and host rocks (e.g., Segall, 2013). Such models may inform probabilistic assessments of volcanic activity that bear some similarities to weather forecasts (Bauer et al., 2015), a development that would make eruption forecasts more useful to both emergency managers and the public.
In this work, we discuss the spectrum of methods used to forecast volcanic activity and advocate for an integrated approach that combines existing methods, like geological studies, global databases, and expert opinion, with emerging methods, including machine learning and physics-based (physicochemical) models. The aim of this approach is to develop probabilistic forecasts of the onset, evolution, and cessation of volcanic eruptions that do not push any one method beyond its realistic limits. We do not attempt a comprehensive review of the stunning depth and breadth of the volcano forecasting literature, but instead cite some representative work and build on numerous past reviews that include both general discussions of the field (e.g., Decker, 1973, 1986; Marzocchi & Bebbington, 2012; Sparks, 2003) and discipline-specific analyses (e.g., McNutt, 1996; Segall, 2013; White & McCausland, 2019). Reading these past works in chronological order provides an interesting perspective on the evolution of volcanic eruption forecasting over time, to which we aim to contribute.

Defining the Problem: Probabilistic Forecasting
"It is this connotation of utility but uncertainty that has become established for the word forecasting that makes it my choice." -- Decker (1986)
As with previous authors (e.g., Decker, 1986; Swanson et al., 1983), use of the word "forecasting" is intentional. Although the meaning can vary greatly by discipline, we define a prediction as a specific, definitive statement about the timing, location, and style/magnitude of a future outcome, even though there may be some relatively small uncertainty in its exact timing (e.g., Swanson et al., 1983). Some of the most successful and best-documented eruption predictions were for the onset of lava extrusion during repeating cycles of dome growth at Mount St. Helens in 1980-1986 (Swanson et al., 1983); at Soufrière Hills Volcano, Montserrat, during the 1990s (Voight et al., 1998, 1999); and for ongoing explosive eruptions at Mount Etna, Italy (Ripepe et al., 2018). (These sorts of short-term predictions can be thought of as an early warning system for eruptions.) In contrast to predictions, forecasts are fundamentally probabilistic and attempt to include the uncertainties that are a part of observing and interpreting all natural processes (e.g., Marzocchi & Bebbington, 2012; National Academy of Sciences, Engineering, and Medicine, 2017); they do not involve definitive statements of future events. Forecast probabilities are ideally as quantitative as possible, although a large degree of qualitative uncertainty is often also necessary (Decker, 1986). The uncertainties associated with forecasts may be frustrating, but also make them more valuable than predictions that do not quantify uncertainty or the probabilities of alternative outcomes (e.g., Rouwet et al., 2017). We turn on the news not for weather predictions but for weather forecasts: probabilistic assessments of future conditions.
Because forecasts are inherently probabilistic, a few words about probabilities are in order. The probability of some future volcanic event can be thought of in terms of either (1) its expected frequency of occurrence, often based on some kind of stochastic (i.e., randomly distributed) model derived from past events, or (2) a subjective degree-of-belief (see Marzocchi and Bebbington (2012) for a discussion of probabilities in the context of eruption forecasting, and Jordan (2014, 2017) for related probabilistic forecasts in seismology). These are often termed objective and subjective probabilities, respectively, although even so-called objective probabilities can rely on subjectively chosen statistical models. The two approaches are closely related but in many ways reflect a fundamentally different view of probability. We do not explicitly distinguish between these points of view in this work; both are useful.
Forecasts also often distinguish between aleatoric uncertainties and epistemic uncertainties. Aleatoric uncertainties are stochastic and due to fundamental nonlinearity and "randomness" in a system, whereas epistemic uncertainties reflect our limited knowledge and can therefore in principle be reduced (e.g., Marzocchi et al., 2004). Such distinctions can be useful but are also necessarily somewhat philosophical, and we do not discuss them in detail in this work.
Finally, we can also distinguish between conditional probabilities (e.g., Connor et al., 2015) associated with specific hazards (for instance, the geographical extent of a lava flow or lahar under the assumption that such an event will occur, and possibly also under assumptions about its size) and total probabilities, which also include the uncertainty in whether or not the volcano will actually produce the hazard. Conditional probabilities involve fewer unknowns and often deal with phenomena for which the governing physics are better understood and modeled; estimating conditional probabilities is therefore a more tractable problem than estimating total probabilities, but the results are also generally of less societal usefulness. There is a wealth of literature on conditional probabilities associated with various volcanic hazards, and even a number of operational tools in use, for example, for lava flow inundation (e.g., Dietterich et al., 2017; Favalli et al., 2005) and ash fall (Schwaiger et al., 2012). In this work, however, we will not focus on forecasting specific hazards but rather the eruptions from which these hazards are derived.
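The distinction between conditional and total probabilities can be written out explicitly. The following is only an illustrative sketch: the symbols H (a given hazard reaches a given location), E (an eruption occurs within the forecast window), and S_i (mutually exclusive eruption size classes) are notation assumed here rather than taken from the works cited, the numbers are invented, and the hazard is assumed to occur only if an eruption does.

```latex
% Total probability of the hazard, assuming P(H | no eruption) = 0
\begin{align}
  P(H) &= P(H \mid E)\, P(E) \\
  % Conditioning additionally on assumed eruption size classes S_i
  P(H) &= P(E) \sum_i P(H \mid E, S_i)\, P(S_i \mid E)
\end{align}
% Invented example: P(E) = 0.10 and P(H | E) = 0.30 give P(H) = 0.03
```

In these terms, a lava flow inundation or ash fall model supplies the conditional factor P(H | E), whereas the eruption forecasts discussed in this work concern P(E).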

Long-Term Forecasts Based on Geological and Historical Records
"We didn't know whether or when Pinatubo would erupt, but we did know that if it erupted, the eruption would almost certainly be big and serious." -- Punongbayan et al. (1996)
As Charles Lyell famously stated, in geology "the present is the key to the past," which in our context can reasonably be expanded to "the present is the key to the past is the key to the future." With respect to volcanic hazards, this means that the historical and geological records of eruptions at a given volcano can be extrapolated to evaluate the potential likelihood and character of future eruptions (Decker, 1986). This is the underlying basis for most volcanic hazards assessments, like those conducted by the U.S. Geological Survey (e.g., Hoblitt et al., 1998), as well as a major component of classifying volcanic risk (e.g., Ewert et al., 2005, 2018). Volcanic hazards assessments, like seismic hazards assessments, can often be viewed as a kind of forecast (Field et al., 2013). For example, an assessment that the town of Orting, Washington, is susceptible to future lahars from Mount Rainier, based on mapping of past events, includes some implicit evaluation of probability and can be viewed as a long-term forecast that can be used for land-use planning and the development of monitoring networks. This assessment should be valid over long time periods to the extent that future activity follows a pattern similar to past activity. If Mount Rainier were to enter a period of unrest, a short- to intermediate-term forecast would become necessary to reflect the increased probability of a lahar in the coming days to months. In this work we do not explicitly distinguish between forecasts and volcanic hazards assessments.
Forecasts of future volcanic activity based on the geological record have shown remarkable prescience. The most famous example may be that of Mount St. Helens. Based on geological mapping of past eruptive deposits (Crandell et al., 1975), Crandell and Mullineaux wrote in 1978 that
"In the future Mount St. Helens probably will erupt violently and intermittently just as it has in the recent geologic past, and these future eruptions will affect human life and health, property, agriculture, and general economic welfare over a broad area … The volcano's behavior pattern suggests that the current quiet interval will not last as long as 1,000 years; instead, an eruption is more likely to occur within the next 100 years, and perhaps even before the end of this century." -- Crandell and Mullineaux (1978)
That the volcano sprang to life and erupted catastrophically just two years later is a testament to the importance of recognizing a volcano's potential behavior based on past activity. Another profound example is the 1991 eruption of Pinatubo. Prior to the awakening of the volcano in March of that year, little was known about the style and recurrence intervals of previous eruptions. A hurried program of geological mapping after the onset of unrest revealed that past Pinatubo eruptions were often large, affected all sectors of the volcano, and occurred every 500-1,000 years, and that it had been 500 years since the last eruption (Punongbayan et al., 1996). Even with only a cursory geological history, volcanologists were able to say that if Pinatubo were to erupt, the eruption would likely be large. This information became the basis for hazards assessment and mitigation efforts that, when combined with monitoring data, ultimately saved thousands of lives.
The geological record is never complete, however. Underrecording of past eruptions (Houghton et al., 2011; Rougier et al., 2018; Siebert et al., 2010) may misrepresent the frequency and style of potential future activity, while very large eruptions may not have occurred frequently enough that useful recurrence statistics can be computed (e.g., Decker, 1986) or may have not yet been recognized in the geological record (Coles & Sparks, 2006; Deligne et al., 2010; Kiyosugi et al., 2015). Even the historical record, which can provide important input for models of future behavior (Woo, 2018), is probably biased toward underreporting (e.g., Moran et al., 2011). These limitations highlight potential pitfalls in relying on geological and historical data for forecasting the frequency and style of future eruptive events and the outcomes of unrest.
Although the geological and historical records provide a general sense of a volcano's activity and may yield insights into changes in eruptive behavior over time based on deviations from past patterns, they are certainly not in themselves quantitative predictive tools (Decker, 1986). Extrapolating these records to quantitatively infer possible future outcomes requires statistical methods (e.g., Pyle, 1998). Volcanic events can be treated as a stochastic process (e.g., Bebbington & Lai, 1996; Woo, 2018), and recurrence rates can be estimated by the number of eruptions that have occurred during a given interval. If eruptions are believed to occur randomly at a rate that is independent of the time since the previous event, they can be modeled as a simple Poisson process (e.g., De la Cruz-Reyna, 1991; Klein, 1982; Marzocchi & Bebbington, 2012) that can then be used to ask questions about the probabilities of future occurrences. More complex statistical distributions (e.g., Bebbington & Lai, 1996; Nathenson et al., 2012) can be used when the data suggest that the probability of an eruption changes, for instance, over time. Evidence suggests that some eruptions may be "time-predictable" (i.e., the timing of a subsequent eruption is a function of the size of the preceding eruption) or "size-predictable"/"volume-predictable" (i.e., the size of a subsequent eruption is a function of the preceding repose time; e.g., Koyama & Yoshida, 1994). Using global eruption catalogs, Marzocchi and Zaccarelli (2006) found that open-conduit volcanoes (defined as those with short interevent times) generally followed the time-predictable model, whereas eruption timing and size at closed-conduit volcanoes (with longer interevent times) followed a random distribution (although the largest eruptions are generally preceded by long repose times; Siebert et al., 2010).
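To make the Poisson reasoning above concrete, the following minimal Python sketch estimates a recurrence rate from an eruption count and converts it into the probability of at least one eruption in a forecast window. It also includes a Weibull renewal model as one illustrative example of a distribution in which the probability changes with time since the last eruption; that choice, the function names, and all numbers are assumptions made for illustration and do not reproduce any particular study cited here.

```python
import math


def poisson_probability(n_eruptions, record_years, window_years):
    """P(at least one eruption in the window) for a homogeneous Poisson process."""
    if record_years <= 0 or window_years < 0:
        raise ValueError("record_years must be > 0 and window_years >= 0")
    rate = n_eruptions / record_years  # eruptions per year, estimated from the record
    return 1.0 - math.exp(-rate * window_years)


def weibull_conditional_probability(elapsed_years, window_years, scale_years, shape):
    """P(eruption within the window | no eruption for elapsed_years), Weibull repose times."""
    def survival(t):
        # Probability that a repose interval lasts longer than t years.
        return math.exp(-((t / scale_years) ** shape))
    return 1.0 - survival(elapsed_years + window_years) / survival(elapsed_years)


# Hypothetical record: 10 eruptions in 500 years, 30-year forecast window.
p_poisson = poisson_probability(10, 500.0, 30.0)
# With shape = 1 the Weibull model is memoryless and matches the Poisson result,
# no matter how long the volcano has already been quiet.
p_memoryless = weibull_conditional_probability(200.0, 30.0, scale_years=50.0, shape=1.0)
# With shape > 1 the probability grows with time since the last eruption.
p_early = weibull_conditional_probability(5.0, 30.0, scale_years=50.0, shape=2.0)
p_late = weibull_conditional_probability(60.0, 30.0, scale_years=50.0, shape=2.0)
print(p_poisson, p_memoryless, p_early, p_late)
```

The sketch takes the eruption record at face value; underrecording or a change in eruptive behavior would bias the estimated rate and therefore the resulting probabilities.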
Forecasts should include not only an assessment of timing but also of location. The spatial probabilities of future vent locations can be derived, for instance, from the locations and ages of past eruptive vents and can be refined by including contextual geological information (e.g., Connor et al., 2000; Bevilacqua et al., 2015, 2017). Vent location probabilities can be considered forecasts that are conditional upon the volcano producing a vent at some location. Of course, eruptions do not necessarily follow these patterns. As an example, the 1538 eruption of Monte Nuovo in Campi Flegrei caldera, Italy, occurred in an area of elevated probability for vent opening (based on the current phase of volcanism; Selva et al., 2012) but well outside the area of highest probability (Figure 3; Bevilacqua et al., 2015). Vent-opening maps that utilize physics-based models (see section 5.2) of magma transport together with knowledge of the stress history of the volcano, however, do accurately anticipate the location of Monte Nuovo and offer a new and exciting means of forecasting vent locations (Rivalta et al., 2019). Although a purely spatial probability map can offer critical information (for example, when assessing the hazard to a long-term radioactive waste repository; Connor & Hill, 1995), temporal probability must also be evaluated to maximize the utility of the spatial information.
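One simple way to turn past vent locations into a spatial probability map is kernel density estimation. The Python sketch below assumes an isotropic Gaussian kernel with a fixed bandwidth and weights every past vent equally; the function name, coordinates, and bandwidth are hypothetical, and published vent-opening maps typically also account for vent ages, geological structure, and uncertainty, which this sketch ignores.

```python
import math


def vent_opening_probability(past_vents, grid_x, grid_y, bandwidth):
    """Probability per grid node that the next vent opens there, given that one opens on the grid."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    weights = []
    for y in grid_y:
        row = []
        for x in grid_x:
            # Sum a Gaussian kernel centered on each past vent.
            w = sum(
                math.exp(-((x - vx) ** 2 + (y - vy) ** 2) / (2.0 * bandwidth ** 2))
                for vx, vy in past_vents
            )
            row.append(w)
        weights.append(row)
    total = sum(sum(row) for row in weights)
    # Normalize so the map sums to 1 (a conditional probability).
    return [[w / total for w in row] for row in weights]


# Hypothetical vent coordinates (km) and a 1 km grid; the bandwidth is an assumed value.
vents = [(1.0, 1.0), (1.2, 0.8), (4.0, 4.0)]
grid = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
prob_map = vent_opening_probability(vents, grid, grid, bandwidth=1.0)
```

Rows of `prob_map` follow `grid_y` and columns follow `grid_x`. Because the map is normalized over the grid, it expresses only where a vent would open if one does; it would still need to be combined with a temporal probability to yield a total probability.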
Historical and geological records of volcanic activity are perhaps most useful when consolidated into global catalogs or databases (e.g., Crosweller et al., 2012; Siebert et al., 2010; Venezky & Newhall, 2007). These databases make information more accessible, allow the construction of statistical relations (especially important for volcanoes where there is little local information; Ogburn et al., 2016), and provide a basis upon which to prioritize monitoring and guide decision making. For example, Ogburn et al. (2015) developed a database of dome-building eruptions that revealed general patterns linking extrusion rate, magma composition, and explosivity, which was subsequently utilized by Wolpert et al. (2016) to forecast the remaining durations of ongoing dome-building eruptions, vital information for managing evacuations and other long-term planning.
Mauna Loa, Hawai'i, provides a poignant example of the advantages, challenges, and insights that can result from statistical explorations of eruptive activity based on the geological and historical record. Thomas Jaggar (Figure 1) was not at all shy about making pronouncements concerning potential activity at Mauna Loa. In 1912, based on what was at the time the longest known historical repose interval at the volcano (eight years) and a historical pattern of alternating summit and flank eruption sites, he stated that a summit eruption would occur by 1 February 1915, followed within five years by an eruption at high elevation on the north flank. A summit eruption indeed began on 25 November 1914, and a flank eruption commenced about 1.5 years later, although the vent opened on the south flank at a lower elevation than Jaggar expected (Decker et al., 1995). Since Jaggar's time, Mauna Loa activity has changed. Although the distribution of all historical recurrence times is statistically random, the longest intervals between eruptions occurred after 1950 (Decker et al., 1995). By not accounting for changes in behavior such as this, statistical approaches will be biased. After Mauna Loa's 1975 eruption, Lockwood et al. (1976) forecast another eruption by 1978 on the volcano's Northeast Rift Zone (based on past eruptive sequences). The location was accurate, but the timing was off: the eruption occurred in 1984. Following that eruption, Decker et al. (1995) forecast that the next eruption would occur by 2007, attempting to account for the longer duration between eruptions. As of 2019, we are still waiting for that eruption.

Short- and Intermediate-Term Forecasts Based on Patterns in Monitoring Data
"An anecdote about forecasting volcanic activity says that every active or potentially active volcano should be studied by a geologist to find out what did happen, by a geophysicist to determine what is happening, and by a lobbyist to tell the government what might happen." -- Decker (1986)
Pattern recognition has played a central role in all eruption forecasts that are made in response to unrest (elevated seismicity, ground deformation, gas emissions, etc.). Monitoring data are evaluated and, if attributed to magmatism, are used to infer the timing, style, and location of potential eruptive activity based on signals examined previously at that volcano or similar volcanoes. The practice is analogous to noticing clouds starting to form as atmospheric temperature and pressure decrease and inferring that rain is likely. The forecast of rain in that case is not based on any special knowledge of the physics and chemistry of the atmosphere but rather on past experience. These indicators are not universally applicable; for instance, weather patterns in equatorial regions are different than those in polar regions. Nevertheless, recognizing patterns in monitoring data provides a means of identifying the observations that will be most directly useful as predictors of future conditions (such as earthquakes repeatedly occurring in a particular location prior to eruptions), and it allows for first-order interpretations of process. Even though the approach can be simple, it often works.

Monitoring Data and Data Gaps
The key to intermediate- or short-term forecasting of the onset of eruptions first involves recognizing the signs that volcanoes often display before eruptions begin. Monitoring of active volcanoes over the past several decades has revealed numerous manifestations of volcanic unrest that sometimes culminate in eruptions, including seismicity, ground deformation, gas, and thermal emissions. These indicators are present at different stages of preeruptive unrest as magma accumulates at depth and then ascends toward the surface (Figure 4). In this section we focus on forecasting the onset of eruptions, but many of the ideas are similar for forecasting eruptive styles and durations, which may represent even bigger challenges for volcanology.
The earliest signs of potential future eruption might be (1) elevated CO2 emissions, since CO2 exsolves at relatively great depths and may ascend more rapidly than its source magma (e.g., Poland et al., 2012); (2) deep long-period (DLP) earthquakes, often believed to be a result of magma migration (e.g., Power et al., 2004; White, 1996); and (3) aseismic inflation, caused by magma accumulation (e.g., Wicks et al., 2002). These signs may all reflect magmatic activity at up to tens of kilometers depth (Dzurisin, 2003). Changes in water chemistry and thermal output may also accompany deep magmatic activity as volatiles released from accumulating magma migrate toward the surface (e.g., Evans et al., 2004). As magma enters shallow storage areas that are perhaps only a few kilometers beneath the surface, unrest may intensify and include inflationary ground deformation, volcano-tectonic (VT) seismicity (which may occur along the magma pathway or on distal faults stressed by the magmatic, and potentially hydrothermal, systems; e.g., Coulon et al., 2017; White & McCausland, 2019), ground cracking, glacier melting, and emission of sulfur gases (Brown, Loughlin, et al., 2015). "Wet" volcanoes with large groundwater systems may show an evolution from H2S to SO2 emissions as groundwater is driven off by rising magma (Symonds et al., 2001). As a result, the onset of significant SO2 degassing and other highly soluble gas species like halogens (e.g., Edmonds et al., 2008), phreatic explosions, and increases in water discharge from springs (e.g., Johnson, Valentine, et al., 2018) are often some of the last manifestations of unrest prior to the arrival of magma at the surface. Late stages of preeruptive volcanic unrest may also include changes in potential fields and seismic velocity (e.g., Brenguier et al., 2008).
Do any volcanoes actually show most of these signs prior to magma reaching the surface? In some cases, yes! The March 2009 eruption of Redoubt, Alaska, provides a prime example (Figure 5). The months before the eruption onset were characterized by elevated CO2 emissions (Werner et al., 2013), thermal activity (Wessels et al., 2013), DLPs (Power et al., 2013), and broad inflation (although the deformation was only recognized in hindsight; Grapenthin et al., 2013). As magma ascended toward the surface in the days to weeks before the eruption, shallow volcano-tectonic earthquake activity intensified, SO2 emissions increased, and a phreatic explosion took place (Bull & Buurman, 2013; Power et al., 2013; Werner et al., 2013).
For every eruption that has displayed clear precursory unrest, however, there have been many more which erupted with few, if any, detected warning signals. For example, there was only a small increase in volcano-tectonic seismicity months before, and minor seismicity in the few hours prior to, the 2015 VEI 4 eruption of Calbuco, Chile (Figure 2b; Castruccio et al., 2016; Delgado et al., 2017). This eruption is noteworthy for breaking the usual trend of forecasting success for volcanoes that awaken from decades of repose (Calbuco's previous eruption was in 1972) with large explosions (e.g., Cameron et al., 2018). Along the same lines, some volcanoes show anomalous signals (e.g., increased seismicity, rapid uplift) that do not culminate in eruption, at least not immediately. Cotopaxi, Ecuador, experienced deformation and seismicity in 2001-2002 that were interpreted as due to magma accumulation (Hickey et al., 2015), but the volcano subsequently went back to sleep until similar unrest culminated in a small eruption in 2015 (Morales Rivera et al., 2017). Different volcanoes exhibit different styles and spatiotemporal patterns of preeruptive unrest, likely making some eruptions (for example, those from open-system volcanoes that are frequently active and experience small eruptions; e.g., Cameron et al., 2018) difficult or impossible to forecast with any realistically feasible monitoring network. Furthermore, major monitoring gaps exist at most volcanoes worldwide due in part to challenging logistics and extreme costs that make it impossible to cover all volcanoes on Earth with dense (or even sparse) networks of ground-based instruments. Brown, Loughlin, et al. (2015) estimated that 25-45% of historically active subaerial volcanoes on Earth are unmonitored by ground-based instrumentation, and many others are monitored by only a few nearby seismometers. Ewert et al. (2005) found that most of the volcanoes in the United States are not monitored at a level that is commensurate with the threat they pose to populations and infrastructure.
Remote sensing can help to close this monitoring gap by providing thermal, gas, and deformation data for any volcano on Earth at little cost to the user. The ever-improving temporal, spatial, and spectral resolutions of satellite data show exceptional promise for detecting not only volcanic eruptions but also their precursors. For example, Furtney et al. (2018) reported that about half of all cases of satellite-detected deformation associated with eruptions occurred before eruption onset; many of these instances were not detected by ground-based instruments. Changes in thermal radiance have proven to be a good indicator of future eruptions at Bezymianny, Kamchatka (van Manen et al., 2013) and can also be used to anticipate the end of an eruption (Bonny & Wright, 2017; Coppola et al., 2017). Likewise, satellite-derived deformation has been shown to be a good indicator of eruption, with a strong correlation between volcanoes that deform and those that erupt (Biggs et al., 2014). In a study of the 47 most active volcanoes in Latin America, Reath et al. (2019) found that many eruptions at volcanoes with dense satellite data sets were preceded by thermal, gas, and/or deformation precursors and, in some cases, all three (Figure 6). Satellite data can never replace ground-based instrumentation for reliable, real-time volcano monitoring, but they have demonstrated a capability to inform an understanding of unrest (especially in poorly monitored areas).
Technological developments (in both instrumentation and methodology) also aid efforts to monitor volcanoes for signs of future eruptive activity. For instance, gravity measurements have long been used to measure subsurface changes in mass, and there are numerous instances of gravity change at volcanoes that provide insight into subsurface processes. Combined with information from other disciplines (e.g., seismicity, gas emissions, thermal changes), such information can be used to assess whether activity is likely to culminate in an eruption or a change in eruptive style (Carbone et al., 2017). Advances in gravimeter design suggest that continuous measurements may soon be possible from networks of sensors at relatively low cost (Middlemiss et al., 2017). Similar advances in instrumentation have also aided the field of gas geochemistry, increasing temporal resolution and expanding the number of species that can be measured (e.g., Lewicki et al., 2017). Using such data, Battaglia et al. (2019) were able to identify gas signatures associated with the formation of hydrothermal seals that precede phreatic eruptions, a finding with potential for monitoring active volcanic lakes worldwide and anticipating hazardous events. In seismology, data processing methods are constantly evolving and provide a diversity of means for interpreting volcanic earthquakes. Bennington et al. (2018), for instance, demonstrated the ability to measure subtle preeruptive changes in seismic velocity at depth using a single seismometer at Veniaminof, Alaska. Future innovation in sensor design and analysis methods for all disciplines will continue to result in the recognition of ever-subtler signs of volcanic unrest.

Pattern Recognition: Qualitative Approaches
In this section, we discuss the qualitative use of patterns in monitoring data to forecast eruptive activity. These forecasts are based on quantitative data and may be quantitative themselves, but they are not produced by mathematical models, instead making use of current monitoring data and comparison with past events. Although conceptual process models are sometimes invoked to understand the observed patterns, the forecasts are based fundamentally on recognizing patterns rather than understanding their physical basis.
As previously mentioned, one of the best-known examples of eruption prediction (in terms of a definitive statement of the timing, location, and style/magnitude of a volcanic event, albeit with some uncertainty in the exact onset time) is that of Mount St. Helens during dome-building eruptions in 1980-1986. During that activity, volcanologists quickly recognized that increasing rates of seismicity and ground deformation (Figure 7) preceded lava extrusion, allowing for accurate predictions of dome-growth events (Swanson et al., 1983).
Even at volcanoes that do not erupt as predictably as Mount St. Helens did in the early 1980s, pattern recognition in monitoring data can yield excellent results. As discussed above, forecasts of eruptions at Mauna Loa based on statistical analysis of the historical and geological records have been of mixed utility, but forecasts based on trends in monitoring data have been more successful. Koyanagi et al. (1975) speculated that Mauna Loa might be reawakening from a 25-year repose based on increased earthquake rates and extension across the caldera starting in 1974 (Figure 8); the eruption occurred just after the manuscript was prepared. The same indicators were detected in the early 1980s, leading Decker et al. (1983) to suggest that an eruption might occur within two years. Mauna Loa erupted the year after that manuscript was published. As a general rule of thumb, forecasting the onset of eruptions at closed-conduit volcanoes (that is, volcanoes that experience years to decades or more of repose between eruptions) has been more successful than at open-conduit or more frequently active volcanoes. For example, the Alaska Volcano Observatory has a high degree of success at forecasting eruption onset at volcanoes that have been in repose for more than 15 years or that experience eruptions of VEI 3 or greater (Cameron et al., 2018), based largely on the recognition of seismic rate increases (Pesicek et al., 2018).
As also inferred on occasion from the geological or historical record (section 3), monitoring data for some volcanoes suggest "inflation-predictable" behavior (Segall, 2013), meaning that eruption occurs when the inflation of the magmatic system (determined from ground deformation measurements) reaches a particular threshold. Axial Seamount is an exceptional example, with eruptions in 2011 and 2015 occurring when uplift had matched or just exceeded the subsidence of the preceding eruption; on this basis, the 2015 eruption was accurately forecast (Chadwick et al., 2012; Nooner & Chadwick, 2016). Apparent inflation-predictable behavior has also been recorded during some time periods at Hekla and Krafla volcanoes in Iceland (Sturkell et al., 2006; Tryggvason, 1980). This type of pattern is far from infallible, however, requiring, among other conditions, elastic behavior of the host rock around a magma system with time-invariant geometry and stress regime (Segall, 2013).
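The inflation-predictable idea reduces to simple arithmetic, sketched below in Python. The function name, its arguments, and the numbers are hypothetical illustrations (they are not measurements from Axial Seamount or any other volcano), and the sketch assumes a constant uplift rate and a fixed threshold, which are the same idealizations that limit the approach in practice.

```python
def years_until_threshold(prior_subsidence_m, uplift_since_eruption_m, uplift_rate_m_per_yr):
    """Years until cumulative uplift recovers the subsidence of the last eruption.

    Assumes a constant uplift rate and that eruption becomes likely once the
    previous co-eruptive subsidence has been fully recovered.
    """
    if uplift_rate_m_per_yr <= 0:
        raise ValueError("the volcano must be inflating for this estimate to apply")
    remaining_m = prior_subsidence_m - uplift_since_eruption_m
    # Zero means the threshold has already been reached or exceeded.
    return max(remaining_m, 0.0) / uplift_rate_m_per_yr


# Hypothetical values: 2.4 m of subsidence, 1.8 m recovered, 0.3 m/yr of uplift.
print(round(years_until_threshold(2.4, 1.8, 0.3), 2))  # about 2 years
```

A result of zero only says that the assumed threshold has been reached; it does not say that an eruption must follow.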
Occasionally, ground-based data enable forecasts on time scales as short as minutes. This may seem like insufficient time to be actionable, but as Earthquake Early Warning demonstrates, even seconds can be enough to find shelter in advance of an imminent hazardous event (e.g., Allen & Kanamori, 2003). The 1991 and 2000 eruptions of Hekla were preceded by ~30 min of recognizable seismicity and strain change, and in 2000 the observations from 1991 made it possible to recognize the signs of rapid magma ascent and impending eruption (Linde et al., 1993; Sturkell et al., 2006). Similar short-term warnings of explosive eruptions can be achieved using infrasound, as at Villarrica, Chile (Johnson, Watson, et al., 2018); the method has also been implemented as an early-warning system with an outstanding success rate at Etna (Ripepe et al., 2018). There is also evidence that eruption locations might be forecast on the basis of patterns in deformation data (Guldstrand et al., 2018).
Pattern recognition can also lead to situations where the expected or most-likely scenario does not occur, an intrinsic and unavoidable aspect of forecasting. Inflation at Hekla measured at a distant tilt station had the appearance of inflation-predictable behavior based on eruptions in 1991 and 2000, but inflation since 2000 has thus far exceeded the deflation associated with those eruptions (Sturkell et al., 2013). The record of deformation and shallow seismicity at Mauna Loa shows a number of increases in earthquake and inflation rates since 1984, but none have (yet) culminated in an eruption (Figure 8). At some point, of course, an eruption will occur at both of these volcanoes, but future eruptions need not follow previous patterns. Recognizing those patterns that are associated with eruptions versus those that are not is the greatest challenge in applying pattern recognition, a challenge with exceptional consequences. In 1976, most of the island of Guadeloupe (72,000 people) in the eastern Caribbean was evacuated due to activity at La Soufrière volcano. Although there ultimately was an eruption, it was minor compared to the tremendous financial and societal costs of the evacuation (Fiske, 1984). But how could the impact of the eruption have been known based only on the patterns determined from monitoring data? Was the evacuation warranted? Misidentifying the potential for eruption could lead to catastrophe, like the 1902 Mont Pelée disaster in which ~29,000 people were killed, an event that must have been on the minds of those who responded to the volcanic crisis in Guadeloupe. Ultimately, forecasts must be judged not on the volcanic outcome itself, which will always be uncertain, but rather on whether or not they can be justified given the information available at the time.
[Figure 7 caption, opening words missing in the source: ... Helens crater as measured during July-September 1981. A dome-building eruption (red dashed line) occurred in September and was preceded by an increasing contraction rate across the fault. The arrow indicates the time that the eruption prediction was issued (based on deformation, seismicity, and other data), and the gray rectangle is the time period during which the eruption was predicted to occur. Modified from Swanson et al. (1983).]
Pattern recognition is often effective even when conceptual models invoked to explain the observed patterns are wrong, an important point given our limited understanding of most volcanic systems. An outstanding recent example is the successful forecast of minor explosions at Kīlauea's summit in 2018. Withdrawal of the summit lava lake associated with the lower East Rift Zone intrusion and eruption resembled activity in 1924, which at that time led to small, locally hazardous summit explosions and one fatality. Stearns (1925) suggested that these explosions were driven by groundwater encroaching on the formerly lava-filled conduit and flashing to steam upon encountering hot rock. On this basis, on 9 May 2018, as the lava lake rapidly receded, the U.S. Geological Survey (USGS) Hawaiian Volcano Observatory warned of the possibility of steam-driven explosions in the coming days. Explosions did occur, but analysis of gas data and groundwater models (Hsieh & Ingebritsen, 2019; Neal et al., 2019) indicated that the explosions were likely driven by processes other than groundwater infiltration. The forecast of small explosions at the summit of Kīlauea in 2018 was accurate, but the conceptual model of their causal mechanism was probably incorrect.

Pattern Recognition: Quantitative Approaches
Although subjective approaches are certainly useful, there are advantages to more quantitative models and probability assessments (direct human evaluation of probabilities, for instance, is subject to many potential fallacies). Yet a fundamental challenge remains how to quantitatively relate monitoring data to the probability of future outcomes (e.g., Marzocchi & Bebbington, 2012). In this section, we discuss the use of patterns in monitoring data with statistical and mathematical models that are not directly based on magma physics.
In one approach, specific monitoring parameters (e.g., the presence of SO2, earthquake rate, strain rate, fumarole temperature) are used as inputs into predefined and preweighted mathematical functions to compute probability distributions. Thresholds in monitoring parameter values or parameter value changes are assigned, usually by expert evaluation (section 5.1), based on their indication of the state of the system (i.e., higher levels of seismicity indicate a higher degree of anomalous behavior; Marzocchi et al., 2004; Sobradelo & Martí, 2015). Although there are many subjective decisions to be made in choosing the functions and the weights (thresholds), this approach allows experts to be directly queried about the relation between monitoring anomalies and volcanic activity (their area of expertise) rather than being asked to directly evaluate probabilities (Marzocchi & Bebbington, 2012); also, by establishing the thresholds in advance, a number of potential pitfalls are avoided in trying to subjectively relate monitoring data to eruption probabilities during an evolving crisis. Once probability distributions are obtained using this technique, they can then be used, for instance, within an event tree framework (section 6).
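A minimal Python sketch of this threshold-and-weight idea follows. Everything in it is an assumption made for illustration: the parameter names, the threshold and weight values, the piecewise-linear "degree of anomaly," and the exponential mapping to a probability are not the published functions of the studies cited above, and the sketch returns a single value rather than a full probability distribution.

```python
import math


def anomaly_degree(value, lower, upper):
    """Map a monitoring value to [0, 1]: 0 at or below `lower`, 1 at or above `upper`."""
    if upper <= lower:
        raise ValueError("upper threshold must exceed lower threshold")
    return min(1.0, max(0.0, (value - lower) / (upper - lower)))


def unrest_probability(observations, thresholds, weights):
    """Combine weighted anomaly degrees into one probability (assumed mapping)."""
    total = sum(
        weights[name] * anomaly_degree(value, *thresholds[name])
        for name, value in observations.items()
    )
    # Assumed mapping: no anomaly gives 0, and the probability saturates toward 1.
    return 1.0 - math.exp(-total)


# Hypothetical thresholds and weights, fixed in advance of any crisis.
thresholds = {"quakes_per_day": (20.0, 100.0), "so2_tonnes_per_day": (500.0, 2000.0)}
weights = {"quakes_per_day": 1.0, "so2_tonnes_per_day": 2.0}
observations = {"quakes_per_day": 60.0, "so2_tonnes_per_day": 500.0}
print(round(unrest_probability(observations, thresholds, weights), 3))
```

Because the thresholds and weights are set beforehand, the same observations always give the same number, which is the practical advantage over ad hoc judgment during a crisis.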
Other approaches interpret changes in terms of failure processes. The failure forecast method (FFM) does not rely upon detailed knowledge of the physical processes driving eruptions, but instead is based on observations and laboratory experiments which suggest that as a material (i.e., rock) is put under strain, it moves toward failure at an accelerating rate. FFM is based on a physical law for failing materials given by $\ddot{\Omega} = A\,\dot{\Omega}^{\alpha}$, where $\Omega$ is an observable quantity (e.g., seismic energy release), $A$ and $\alpha$ are empirical constants, and dots denote time derivatives such that $\dot{\Omega}$ is the rate of change of the observable quantity and $\ddot{\Omega}$ is its acceleration (Voight, 1988). For $\alpha > 1$, solution of the equation yields a power law with a singularity (failure) at finite time (e.g., Boué et al., 2016; Voight, 1988). Power law acceleration of counts, energy, or amplitude in monitoring data such as deformation, geochemistry, and (particularly) seismicity can be viewed as progress toward failure, modeled using this expression, and the results used to forecast the timing of failure, that is, eruption onset (Voight, 1988). FFM has been successfully applied to several eruptions, such as at Pinatubo in 1991 (Cornelius & Voight, 2018), but has a number of important limitations, including the difficulty and subjectivity of choosing appropriate data and empirical fitting parameters (e.g., Bell et al., 2011; Tárraga et al., 2008). In practice, different parameter choices can yield very different forecasts, and forecast time scales are often too short (Tárraga et al., 2008). As a result, FFM has proven more suitable for hindcasting of previous events than for real-time forecasting. Continuing development of FFM-based techniques should, however, improve its usefulness (Bell et al., 2011; Bevilacqua et al., 2019; Boué et al., 2016; Kilburn, 2018; Salvage & Neuberg, 2016; Vasseur et al., 2015). For a more complete review of FFM we refer readers to Tárraga et al. (2008) and introductions in the more recent papers cited above.
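To make the mechanics concrete, consider the special case $\alpha = 2$, which is an assumption of this sketch and not a requirement of the method. Integrating $\ddot{\Omega} = A\,\dot{\Omega}^{2}$ once gives $1/\dot{\Omega} = A\,(t_f - t)$, so the inverse rate falls linearly to zero at the failure time $t_f$. The Python sketch below fits that line by ordinary least squares and extrapolates it to zero; the function and variable names are hypothetical and the data are synthetic.

```python
def forecast_failure_time(times, rates):
    """Extrapolate a least-squares line through inverse rate to zero (assumes alpha = 2)."""
    if len(times) != len(rates) or len(times) < 2:
        raise ValueError("need at least two paired observations")
    inverse = [1.0 / r for r in rates]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(inverse) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, inverse))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("rates are not accelerating; no finite failure time")
    intercept = mean_y - slope * mean_t
    # The fitted line crosses zero inverse rate at the forecast failure time.
    return -intercept / slope


# Synthetic event rates generated with A = 0.5 and a true failure time of 10 days.
days = [0.0, 2.0, 4.0, 6.0, 8.0]
events_per_day = [1.0 / (0.5 * (10.0 - t)) for t in days]
print(round(forecast_failure_time(days, events_per_day), 3))
```

With noisy real data, the result depends strongly on which observations are included and on the assumed value of $\alpha$, which is the subjectivity discussed above.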
Advances in machine learning show great promise for enabling a new generation of forecasts based on spatiotemporal pattern recognition through semiautomated computer analysis of monitoring data. Machine learning has exploded in popularity and sophistication in recent years, but these techniques are just starting to find broader application to solid Earth geosciences and volcanology in particular (e.g., Bergen et al., 2019). Unsupervised machine learning algorithms seek patterns or structure in raw unlabeled data, while supervised algorithms use labeled data from past activity to teach an algorithm to interpret new observations. These approaches are not based on physics (although physical laws can be utilized to good advantage; e.g., Karpatne et al., 2017; Reichstein et al., 2019) but rather on the ability of algorithms to sift through vast amounts of data and detect patterns that would be difficult for any human analyst to recognize. Applications to all aspects of forecasting are numerous, from improved detection and classification of important signals in noisy data to acceleration of computationally intensive physics-based models through the construction of "emulators" (computationally efficient model replacements). Rouet-Leduc et al. (2017), for instance, argued that a machine learning algorithm was able to recognize that acoustic signals emitted by faulting in a laboratory sample (previously dismissed as noise) could be used to predict the time remaining before failure.
In volcanology, an early application of machine learning (in this case, pattern recognition) investigated preeruptive seismic data during 217 episodes of volcanic unrest around the world. To date, however, machine learning algorithms have seen the widest use not directly in eruption forecasting (see Brancato et al. (2016), however, for an early example), but rather in the related problems of detection and classification of seismic signals (Curilem et al., 2009; Esposito et al., 2006, 2008; Ibs-von Seht, 2008; Masotti et al., 2006; Scarpetta et al., 2005), thermal anomalies (from remote sensing data; Piscini & Lombardo, 2014), and satellite-measured ground deformation (Anantrasirichai et al., 2018). These studies demonstrate the importance of machine learning for "data discovery," which can feed directly into forecasts.
There are, of course, challenges. Machine learning models are often "black boxes"; they might yield an optimal answer, but their internal machinations are often opaque (particularly in the case of deep neural networks, which have multiple layers of hidden nodes), and they generally do not yield any insight into the actual physical processes at work; indeed, they may even predict physically impossible scenarios (Reichstein et al., 2019). The outputs from such models therefore do not much inform our conceptual or physical models (although it is worth noting that machine learning algorithms can be turned to such problems as well), nor can they easily be explained to land managers or the public; it also remains to be seen how the legal system would assign liability for any forecast errors resulting from a machine learning algorithm. Machine learning-based eruption forecasting approaches are also strongly limited by the lack of well-monitored eruptions that can be used to train the algorithms. Despite these limitations, as data quality and quantity improve, machine learning algorithms advance, and the volcano community becomes more familiar with their application, we expect that machine learning will find broad and possibly transformative application to volcano forecasting.

Magma System Models in Forecasting Volcanic Eruptions
"Forecasting is evolving from empirical pattern recognition to forecasting based on models of the underlying dynamics." --Sparks (2003)
To aid interpretation, monitoring data should ideally be considered in the framework of some kind of volcano system model. We consider here: (1) conceptual models (and straightforward kinematic models), which might be guided by mathematical formulations but that have no inherent predictive value and (2) deterministic models based on the physics of magmatic processes, from which the future state of the volcano can be directly computed given its current state.

Conceptual and Kinematic (Nondeterministic) Models
Conceptual models of magmatic systems and processes (describing, for instance, the geometry of a magmatic plumbing system, or relating an observation with a magmatic process) provide essential context in which to interpret observations (Newhall & Hoblitt, 2002; Newhall & Pallister, 2015), such as whether or not magma is the cause of observed unrest. Real-time data may be qualitatively evaluated and leveraged to produce forecasts within the framework of a conceptual model, although volcanologists must be careful to avoid the trap of thinking that their models are perfect representations of reality. For individual volcanoes, conceptual models can range from the very general (for poorly understood systems) to the very specific (for better-studied systems) and can be based on any available information, such as geological observations of eroded volcanoes (e.g., Walker, 1992), simple interpretations of monitoring data (such as an aseismic zone indicating the presence of magma; e.g., Scandone & Malone, 1985; Thurber, 1984), experimental evidence connecting petrologic observations with magmatic sources (e.g., Rutherford et al., 1985), and mathematical models of ground deformation data that indicate the location of subsurface magma sources (e.g., Mogi, 1958). Conceptual models can also be used to qualitatively relate observations with physical processes, such as McNutt's (1996) generic volcanic earthquake swarm model and White and McCausland's (2019) process-based conceptual model of evolving seismicity prior to magmatic eruptions at dormant volcanoes. Mount St. Helens provides an illustrative example. Starting soon after the catastrophic eruption of 18 May 1980, geochemical and geophysical data suggested the presence of an elongated magma reservoir at 7-14 km beneath the volcano that was intermittently recharged by magma from greater depth (Pallister et al., 1992; Rutherford et al., 1985; Scandone & Malone, 1985). Subsequent geochemical and seismic data helped to refine this picture, tightening the depth and probable volume of magma storage (Moran, 1994; Rutherford & Hill, 1993). This conceptual model for the volcano's magmatic system became the basis for interpreting the reawakening of the volcano in 2004 and the dome-building eruption of 2004-2008. For instance, models of far-field deformation recorded by continuous GPS during that eruption indicated deflation of a source 7-8 km beneath the volcano, coincident with the previously identified magma reservoir and suggesting that magma was being drawn from that source (Dzurisin, 2018; Lisowski et al., 2008).
10.1029/2018JB016974
Journal of Geophysical Research: Solid Earth
Geological observations, geophysical monitoring data, and mathematical models from both quiescent and eruptive periods during the more than 100-year existence of the Hawaiian Volcano Observatory have contributed to numerous conceptual models of Kīlauea's magma plumbing system (e.g., Figure 9). During the volcano's large and destructive 2018 eruption these models provided context in which to interpret monitoring data, assess hazards, and issue forecasts (Neal et al., 2019). These models included a strong magmatic connection between the volcano's summit and rift zones, indicating that eruption of lava on the volcano's flanks would be associated with deflation of the summit magma reservoir. Simple kinematic (nonpredictive) models allowed scientists to infer rates of magma withdrawal from the summit and rates of pressure decrease in the magma reservoir, and they placed constraints on discussions of the possibility for failure of the rock overlying the reservoir. As noted in section 4.2, conceptual models of Kīlauea's 1924 eruption (Stearns, 1925) prompted the Hawaiian Volcano Observatory's warning of the possibility of summit explosions and resulted in the closure of Hawai'i Volcanoes National Park. When the summit of the volcano began to collapse, conceptual models of the collapse mechanism, together with quantitative model-derived estimates of the magma reservoir's depth and volume, were utilized to qualitatively assess hazards, such as the possibility of larger collapse-driven explosions.
Conceptual models (of specific volcanoes, specific processes, general behaviors, etc.) provide a context in which to interpret monitoring data and geological observations, which can be used to guide probabilistic assessments of potential future activity via scientific consensus. This expert elicitation approach involves compiling the opinions of individual experts on probable outcomes and, in one form or another, plays a critical role in the great majority of current eruption forecasts (e.g., Aspinall & Cooke, 1998). In perhaps its simplest form, it consists of no more than a group of scientists holding discussions and reaching a qualitative consensus assessment of the likelihood of an event or sequence of events (e.g., "the volcano is unlikely to erupt"). Consensus estimates may also be quantitative, for example, a <1% chance of a pyroclastic flow in the next week (see section 6; Newhall & Pallister, 2015). The practice is a useful means of determining where there is general agreement and where there are high degrees of uncertainty. More structured elicitations are also possible. In the Delphi method, at least two rounds of questionnaires are completed individually by experts and then shared with the group, allowing opinions to be revised after participants have seen the results from their peers (see Aspinall (2010) for a brief review of expert opinion solicitation in the context of volcanology). One risk in these styles of elicitation is that group opinion may be unduly weighted by the perceived scientific stature of the individuals (or other factors), rather than the scientific merit of the ideas (Aspinall, 2010). To address these concerns, it is also possible to weight expert opinion using a performance-based calibration score (Aspinall, 2006).
In this approach, which is generally known as Cooke's method, a group of experts is asked to individually answer a series of calibration questions with known answers and to provide estimates of their confidence in these answers. The calibration questions are then used to assess the participants' individual expertise and degree of confidence in their answers (neither overconfidence nor underconfidence is desirable), from which scoring weights are computed. These weights are then applied to the expert forecasts in order to produce an unbiased aggregate uncertainty (Aspinall, 2006; Aspinall & Cooke, 2013). Cooke's method is especially useful for focusing on one or a few contentious issues, such as whether or not an episode of unrest will culminate in a hazardous eruption, and for groups that have difficulty reaching consensus due to differing opinions. Regardless of how the opinions are synthesized, however, the simple process of discussing possible future outcomes as a group in an elicitation framework can be extremely useful. Once aggregated, the forecasts resulting from expert elicitation can be shared directly with land managers, can help to inform the placement of instruments and hazard zones, or can be used as input to forecasting frameworks such as event trees (see section 6). Conceptual models provide the foundation upon which these expert judgment forecasts are based.
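The weighting step can be sketched in a few lines of Python. This is a deliberately simplified illustration, not Cooke's classical model: the score used here (how closely the hit rate of an expert's 90% intervals matches 90%) is a crude stand-in for the statistical calibration and information scores of the real method, and all names, intervals, and probabilities are hypothetical.

```python
def calibration_score(intervals, truths, nominal=0.9):
    """Crude stand-in score: penalize any gap between hit rate and nominal coverage."""
    hits = sum(1 for (low, high), truth in zip(intervals, truths) if low <= truth <= high)
    # Both overconfidence (too few hits) and underconfidence (too many) lower the score.
    return 1.0 - abs(hits / len(truths) - nominal)


def pooled_probability(estimates, scores):
    """Score-weighted average of the experts' probability estimates."""
    total = sum(scores[name] for name in estimates)
    if total <= 0:
        raise ValueError("at least one expert needs a positive score")
    return sum(scores[name] * estimates[name] for name in estimates) / total


# Ten calibration questions with known answers (hypothetical values).
truths = [float(v) for v in range(1, 11)]
intervals = {
    # Expert A: wide intervals that capture 9 of the 10 known answers.
    "A": [(t - 2.0, t + 2.0) for t in truths[:9]] + [(20.0, 30.0)],
    # Expert B: narrow, overconfident intervals that capture only 5 of 10.
    "B": [(t - 0.1, t + 0.1) for t in truths[:5]] + [(t + 1.0, t + 2.0) for t in truths[5:]],
}
scores = {name: calibration_score(ivals, truths) for name, ivals in intervals.items()}
estimates = {"A": 0.10, "B": 0.50}  # each expert's probability of eruption
print(scores, round(pooled_probability(estimates, scores), 3))
```

The pooled value sits closer to the estimate of the better-calibrated expert, which is the intended effect of performance-based weighting.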

Deterministic Models and Data Assimilation
Physicochemical (also referred to as "physics-based") volcano models relate fundamental magma and material physics with diverse monitoring data. In contrast to kinematic models, physicochemical models are often fundamentally predictive (deterministic): given a set of initial conditions (for instance, the volume, pressure, and volatile content of magma in a reservoir), they can quantitatively predict the future evolution of the system. Forecasts based on the output of physicochemical models capitalize on decades of work to understand and model the physics of magmatic systems. Model-based forecasts are inherently quantitative, are amenable to automation, and, by predicting a diverse range of monitoring data, they allow forecasts to be informed by all available observations. Finally, deterministic volcano models can be incorporated into existing forecasting and data assimilation algorithms (which are widely and operationally implemented in weather and financial market forecasting), thus allowing the volcanological community to leverage decades of experience and research in other disciplines.
Forecasting eruptions using models of the deterministic physical laws governing magmatic processes has been a goal of the volcanology community for decades (Decker, 1973, 1986; National Academies of Sciences, Engineering, and Medicine, 2017; Segall, 2013; Sparks, 2003), but remains largely unrealized. Volcanoes are dynamic systems with nonlinear governing physics; for example, the interplay between magma ascent rate, degassing, and crystallization in a conduit can drive abrupt and dramatic changes in eruptive activity (e.g., Kozono & Koyaguchi, 2009; Melnik & Sparks, 1999). As a result, even small changes in the current state of the system can result in large changes in expected activity. Since the conditions within a volcano at the present time (its state) must be inferred imperfectly from limited and noisy measurements, and small changes in initial conditions in highly nonlinear (or "chaotic") systems (Marzocchi & Bebbington, 2012; Papale, 2017) can result in radically different future outcomes (Lorenz, 1963), forecasts may diverge very rapidly from reality. This raises the possibility that it may always be fundamentally difficult to forecast volcanic activity using deterministic models (Koyaguchi, 2016), and that, in some sense, volcanoes may be inherently unpredictable (Sparks, 2003).
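The sensitivity to initial conditions described by Lorenz (1963) is easy to demonstrate numerically. The Python sketch below integrates Lorenz's three-variable convection system (a toy atmospheric model, not a volcano model) from two starting points that differ by one part in 10^8 and reports how far apart the two runs end up. The integrator, step size, starting state, and function names are choices made for this illustration.

```python
def lorenz_step(state, dt, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the Lorenz (1963) system one step with fourth-order Runge-Kutta."""

    def deriv(s):
        x, y, z = s
        return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def separation_after(duration, perturbation, dt=0.01):
    """Distance between two runs whose initial x values differ by `perturbation`."""
    a = (1.0, 1.0, 1.0)
    b = (1.0 + perturbation, 1.0, 1.0)
    for _ in range(int(round(duration / dt))):
        a, b = lorenz_step(a, dt), lorenz_step(b, dt)
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5


# The gap between the two runs grows by many orders of magnitude.
for duration in (1.0, 10.0, 30.0):
    print(duration, separation_after(duration, 1e-8))
```

An uncertainty this small in the estimated state of a real magmatic system is unattainable, which is why forecasts from strongly nonlinear models can lose skill quickly.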
To make matters worse, the physics of volcanoes is not well understood, geometries and material properties are poorly known, and volcanic processes occur over spatial and temporal scales that span orders of magnitude. These complications require many simplifications to render models mathematically and computationally tractable. Almost all volcano models are thus highly specialized for a particular type of eruptive behavior or volcano. In the context of volcano forecasting, this presents a fundamental challenge: during a crisis, how can scientists know which model to utilize (or which model to set about developing, often itself an impossible task under crisis conditions) for an eruption that has not yet been observed? This problem is particularly acute for volcanoes that have not erupted in historical time and for volcanoes that erupt with a wide variety of styles, and it sets volcano forecasting apart from, say, hurricane forecasting, in which a given hurricane model might be expected to perform reasonably well for a wide range of different events.
Is the problem hopeless? Meaningful physics-based forecasts of some types of eruptions will likely remain out of reach for the foreseeable future, but for certain problems there are several reasons for optimism. Firstly, deterministic physics-based eruption models are increasing in availability and sophistication and are already capable of predicting a range of observations (e.g., Anderson & Segall, 2011; Mastin et al., 2008). One such model (Mastin et al., 2008) was even used during the 2004-2008 eruption of Mount St. Helens to forecast the final volume of the eruption (Mastin et al., 2009), and a volcano-like chamber-conduit model was used to forecast the duration of the Lusi mud eruption in Indonesia (Rudolph et al., 2011). Models of magma transport that account for a volcano's stress history can even forecast the location of future eruptive vents (Rivalta et al., 2019).
[Figure 9 caption, opening words missing in the source: ... Poland et al. (2014). The schematic (and accompanying information, like the probable volumes of magma storage within summit reservoirs) was based on a variety of data sets, experience from past eruptive activity, and simple kinematic models, and it was a helpful basis upon which to discuss expected outcomes of Kīlauea's 2018 flank lava effusion and summit collapse.]
Secondly, although volcanic systems are highly nonlinear, monitoring data collected during eruptions frequently show a temporal evolution that suggests a relatively smooth, quasi-exponential decrease in reservoir pressure and/or eruption rate (e.g., Hreinsdóttir et al., 2014). These observations are consistent, to first order, with very simple models of deflating magma reservoirs in Earth's elastic crust (e.g., Machado, 1974; Scandone, 1979), and these models (or more sophisticated variants; e.g., Mastin et al., 2008) can be used in a straightforward manner to predict future behavior (Blake & Cortés, 2018; Mastin et al., 2009). Furthermore, repeating, often cyclic or periodic behavior is increasingly well recorded (e.g., Anderson et al., 2015; Nooner & Chadwick, 2016; Voight et al., 1998), offering the opportunity to design experiments, carefully collect data, build models, and, importantly, evaluate forecast success. Of course, far more complex and apparently unpredictable behavior also occurs, and ultimately we will only be able to demonstrate an improved understanding of volcanic systems by comparing data and forecasts across a wide range of systems.
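A simple exponential model of this kind can be fitted and extrapolated in a few lines. The Python sketch below assumes that the eruption rate decays as q(t) = q0 exp(-t/tau), fits q0 and tau by least squares on the logarithm of the rate, and integrates the fitted curve to estimate the volume still to be erupted. The function names and the synthetic data are hypothetical, and the sketch is not the model of any study cited above.

```python
import math


def fit_exponential_decay(times, rates):
    """Least-squares fit of rate = q0 * exp(-t / tau) in log space; returns (q0, tau)."""
    logs = [math.log(q) for q in rates]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("rates are not decaying")
    return math.exp(mean_y - slope * mean_t), -1.0 / slope


def remaining_volume(q0, tau, t_now):
    """Integral of the fitted rate from t_now to infinity."""
    return q0 * tau * math.exp(-t_now / tau)


# Synthetic effusion rates (m^3/day) generated with q0 = 1e5 and tau = 50 days.
days = [0.0, 10.0, 20.0, 30.0, 40.0]
rates = [1.0e5 * math.exp(-t / 50.0) for t in days]
q0, tau = fit_exponential_decay(days, rates)
print(round(q0), round(tau, 2), round(remaining_volume(q0, tau, 40.0)))
```

The fit recovers the generating parameters here only because the synthetic data are noise-free; real time series would also call for an uncertainty estimate on the fitted values.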
Finally, it is now widely recognized that forecasts are subject to considerable uncertainty and must take the form of probability distributions that are conditioned on uncertainty in the data, initial conditions, and even the models themselves. To that end, much recent progress has been made incorporating physicochemical eruption models into probabilistic frameworks that couple the problem of estimating the state of the volcano (e.g., reservoir pressure) with its future evolution. Incorporation of a physicochemical eruption model into a Bayesian Markov chain Monte Carlo (MCMC) inverse framework allowed Anderson and Segall (2013) to utilize data and independent prior information to obtain probabilistic estimates of model parameters such as the volume of the magma reservoir; these distributions can be used as initial conditions to the same deterministic model to probabilistically forecast future behavior (in fact, the parameter estimation and forecasts can be computed simultaneously; Segall, 2013). Sophisticated data assimilation algorithms, particularly the ensemble Kalman filter, are also being applied to volcanic systems (Bato et al., 2017, 2018; Gregg & Pettijohn, 2016). These approaches allow the state of the volcanic system, and forecasts, to be dynamically updated in near real-time as new observations become available, utilizing techniques widely employed in other disciplines. A fundamental challenge of this work remains, and will likely always remain, accounting for the degree to which the model itself does not accurately represent reality ("model error"). As more physicochemical eruption models become available, however, it will become possible, at least in some cases, to obtain improved forecasts and to put a minimum bound on forecast uncertainty due to model error by intelligently combining the results from ensembles of different models (e.g., Krishnamurti et al., 1999; Orrell et al., 2001; Tebaldi & Knutti, 2007).
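The core of the ensemble Kalman filter is its update (analysis) step, sketched below in Python for the simplest possible case. The sketch assumes a single scalar state (a hypothetical reservoir overpressure) that is observed directly; a real application would have a vector state, a physical forward model linking state to observations such as deformation, and a model forecast step between updates. All names and numbers are illustrative.

```python
import random
import statistics


def enkf_update(ensemble, observation, obs_std, rng):
    """Stochastic ensemble Kalman filter analysis step for a directly observed scalar."""
    prior_var = statistics.variance(ensemble)
    # Kalman gain: how far to trust the observation relative to the ensemble spread.
    gain = prior_var / (prior_var + obs_std ** 2)
    # Each member assimilates its own noise-perturbed copy of the observation.
    return [
        member + gain * (observation + rng.gauss(0.0, obs_std) - member)
        for member in ensemble
    ]


rng = random.Random(0)
prior = [rng.gauss(10.0, 2.0) for _ in range(500)]  # hypothetical overpressure, MPa
posterior = enkf_update(prior, observation=14.0, obs_std=1.0, rng=rng)
print(round(statistics.mean(prior), 2), round(statistics.mean(posterior), 2))
print(round(statistics.stdev(prior), 2), round(statistics.stdev(posterior), 2))
```

The updated ensemble shifts toward the observation and narrows, and it can then be propagated forward with the model to give a probabilistic forecast.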

Integrative Frameworks for Probabilistic Eruption Forecasting
"We have found it useful to organize questions about volcanic hazard and risk into event trees in which the trunk is the most general, initial event and branches lead to increasingly specific, subsequent outcomes." --Newhall and Hoblitt (2002)
Despite the progress outlined above, there is no "magic bullet" in forecasting volcanic eruptions. No monitoring data can definitively show what will happen next. Volcanoes are extraordinarily complex and nonlinear systems, the geological record is incomplete, experts disagree, data are limited and noisy and often difficult to directly relate to magmatic processes, models are biased, and too few analogous past events are available for study. Volcanologists increasingly find themselves armed with a collection of incomplete information and a toolbox of many useful (but individually imperfect) forecasting tools. Coupled with the stress and rapid pace of some volcanic crises, subjectively evaluating and ingesting multiple sources of rapidly changing and occasionally contradictory information may be neither feasible nor rigorously (quantitatively) defensible (Aspinall & Woo, 2014). Such considerations have risen to the forefront of many scientists' minds following the L'Aquila trial in Italy (Bretton et al., 2015).

10.1029/2018JB016974
In light of these challenges, it seems self-evident that forecasts should utilize all available information (from knowledge of physics, to monitoring data, to expert opinion, to observations of past events and analog systems) in an integrative, quantitative framework. Furthermore, such forecasts must necessarily be probabilistic, accounting as much as possible for uncertainties in all constituent elements of the forecast (e.g., Rouwet et al., 2017).
The introduction of the event tree concept to volcanology was a major step toward these goals. Event trees (Figure 10a) are graphical representations of the spatiotemporal evolution of a volcanic event, extending sequentially through branches from the general (e.g., will the current unrest culminate in a magmatic eruption?) to the specific (e.g., what hazards will affect what areas of the volcano?; Aspinall & Woo, 2014; Connor et al., 2015). Event trees can thus combine both long- and short-term forecasts into a single framework. Probabilities assigned to each possible outcome at each fork (node) in the tree are typically derived from some form of expert elicitation, statistics from analog systems, and/or global databases (e.g., Ogburn et al., 2015, 2016), and may involve preweighted functional relations between monitoring data and probabilities (Marzocchi et al., 2008). Event trees can be thought of in a Bayesian sense, in which new information (e.g., monitoring data) modifies prior distributions (e.g., the geological history of the volcano), and Bayesian computations are also often performed to derive conditional probabilities for the individual nodes (Newhall & Hoblitt, 2002; although Sheldrake et al. (2017) argue that the structure of event trees is not fundamentally Bayesian). Probabilities at each node in an event tree are often single values, but to better capture uncertainty they may also take the form of probability distributions ("probabilities of probabilities"; e.g., Connor et al., 2015; Marzocchi et al., 2004). To compute the probability of a given outcome, all of the conditional probabilities leading up to that outcome are multiplied together. For instance, the probability that a VEI 3 eruption will occur depends on the probability that the volcano experiences unrest times the probability that the unrest is magmatic in origin times the probability that the magma reaches the surface, etc.
If conditional probabilities are distributions rather than single values (the "probabilities of probabilities"), then the calculations are somewhat more complex but can be performed using mathematical relations, software, Monte Carlo sampling, etc. (Marzocchi et al., 2008; Neri et al., 2008).
Figure 10. (a) Event tree, after Newhall and Hoblitt (2002) and Wright et al. (2018). Each branch (or node) in the tree is assigned a probability; multiplying these probabilities across the tree results in the probability of specific outcomes (i.e., that certain hazards will impact specific areas). The first several branches are mutually exclusive (in other words, there will either be a magmatic eruption or not), while the branches dealing with eruption hazards are not (i.e., the probabilities of different hazards from a VEI = 2 eruption, like ash, lava flows, and lahars, are not exclusive). (b) Simple Bayesian Belief Network (BBN), modified from Hincks et al. (2014). In this BBN, the desired output is the probability of eruption (query node). The eruption results from magmatic unrest, which cannot be observed directly but can be inferred from data such as seismicity and deformation.
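The multiplication of conditional probabilities along one path of an event tree, and its Monte Carlo counterpart when each node carries a distribution instead of a single value, can be sketched as follows. The branch probabilities and Beta parameters are invented for illustration, the function names are hypothetical, and the node uncertainties are assumed to be independent, which a real elicitation may not justify.

```python
import math
import random


def path_probability(conditional_probs):
    """Probability of one outcome: product of the conditionals along its path."""
    return math.prod(conditional_probs)


def sample_path_probability(beta_params, n_samples=20000, seed=0):
    """Monte Carlo version: each node probability is drawn from a Beta(a, b).

    Returns the mean and the 5th and 95th percentiles of the path probability.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        draws = [rng.betavariate(a, b) for a, b in beta_params]
        samples.append(math.prod(draws))
    samples.sort()
    mean = sum(samples) / n_samples
    return mean, samples[int(0.05 * n_samples)], samples[int(0.95 * n_samples)]


# Hypothetical nodes: unrest occurs, unrest is magmatic, magma reaches the surface.
point = path_probability([0.5, 0.4, 0.3])
mean, p05, p95 = sample_path_probability([(5, 5), (4, 6), (3, 7)])
print(point, mean, p05, p95)
```

The Beta distributions here have means equal to the point values, so the sampled mean should sit close to the point estimate while the percentiles give a sense of how uncertain that estimate is.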
The advantages of event trees in eruption forecasting were first demonstrated in the 1990s at Pinatubo (Punongbayan et al., 1996) and Soufrière Hills Volcano (Aspinall & Cooke, 1998), and the procedure was formalized by Newhall and Hoblitt (2002), leading to wide adoption in subsequent years. Event trees are flexible, can be deployed rapidly in a crisis, are transparent when detailed notes are kept (critical for accountability), and constituent probabilities may be determined by any suitable method and thus utilize all available information (Newhall & Pallister, 2015). Construction of event trees also focuses discussion about possible causes and outcomes, highlights the possibility of extreme events, guides the collection of monitoring data, and motivates volcanologists to develop models and identify analog systems (e.g., Wright et al., 2018). The event tree approach has been utilized by numerous groups and formalized in a variety of semiautomated tools for visualizing and computing probabilities as an aid for decision makers, scientists, and civil planners (e.g., Marzocchi et al., 2010; Sandri et al., 2017; Sobradelo et al., 2014). For an example of event trees in practice, see Wright et al. (2018) on the development of six separate event trees as activity evolved during 2013-2014 at Sinabung, Indonesia.
Logic trees, as typically utilized in volcanology, are closely related to event trees, but with nodes that represent models instead of events, and with transition probabilities (the probabilities assigned to each node) as weights assigned to the models (e.g., National Academies of Sciences, Engineering, and Medicine, 2017). Logic trees thus allow for the incorporation of alternative scientific views and models of the same process into forecasts, with results weighted by relative confidence in these models (as determined, for instance, by expert elicitation; Coppersmith et al., 2009). By changing the weights, it is easy to evaluate the effect of different models on forecasts. Like event trees, logic trees allow a distinction between aleatoric and epistemic uncertainty and total forecast uncertainties that are conditioned on both (e.g., Marzocchi et al., 2004; Marzocchi & Bebbington, 2012). Logic trees are widely used in probabilistic seismic hazard assessments; for instance, the third version of the Uniform California Earthquake Rupture Forecast combined numerous fault, deformation, and earthquake-rate models in a logic tree with more than 1,000 branches (Field et al., 2013).
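A minimal sketch of the weighting idea behind a logic tree follows, assuming three hypothetical models that each return a probability for the same event and a set of confidence weights such as might come from expert elicitation. All names and numbers are invented for illustration. Changing the weights and recomputing shows how sensitive the combined forecast is to confidence in each model.

```python
def logic_tree_forecast(model_probs, weights):
    """Weighted average of model forecasts; weights are normalized to sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(model_probs[name] * w / total for name, w in weights.items())


# Hypothetical probabilities of the same event from three alternative models.
model_probs = {"model_a": 0.20, "model_b": 0.50, "model_c": 0.35}

baseline = logic_tree_forecast(
    model_probs, {"model_a": 0.5, "model_b": 0.3, "model_c": 0.2}
)
# Sensitivity check: shift confidence toward model_b.
reweighted = logic_tree_forecast(
    model_probs, {"model_a": 0.2, "model_b": 0.6, "model_c": 0.2}
)
print(baseline, reweighted)
```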
Rather than describe a time progression, as event trees do, Bayesian Belief Networks (BBNs; Figure 10b) describe physical states of the volcano and are used to infer processes and conditions. A BBN is a graphical representation of the probabilities relating a set of variables; rather like event and logic trees, BBNs are composed of a set of nodes (called features, which take the form of random variables) linked together by lines representing relations between these features (BBNs can even have a tree structure; Hincks et al., 2014). In the context of volcanology, BBNs can be used to represent probabilistic relations between, for instance, visible observations (gas emissions, seismic energy rate, etc.), unobserved processes (ascent of magma into the system, perturbation of the hydrothermal system, etc.), and possible outcomes (lava flow, explosion, etc.; e.g., Aspinall et al., 2003; Cannavò et al., 2017; Christophersen et al., 2018; Hincks et al., 2014; Sheldrake et al., 2017). Each node is associated with a conditional probability that gives its relation to other nodes with which it is connected. These probabilities (which may be continuous or discrete; e.g., high, medium, or low levels of seismicity) may be derived from training data, assigned by experts (Aspinall et al., 2003), or, in principle, computed by models. Once constructed, BBNs may be used to explore relations between cause and effect and to infer unobserved variables. Although BBNs are not inherently dynamical (that is, they do not consider temporal changes in the system), they can be evaluated and updated repeatedly as time progresses to give changing forecasts, like event trees (e.g., Wright et al., 2018). More complex dynamic BBNs that explicitly model time also exist but have seen relatively little application to volcanology (Hincks et al., 2014), a notable exception being the application of a hidden Markov model (a type of dynamic BBN) to data from Montserrat.
BBNs have been used in retrospective analysis of data from La Soufrière (Guadeloupe) during -1977 (Hincks et al., 2014), during the unrest at Santorini Volcano in 2011-2012 (Aspinall & Woo, 2014), and with monitoring data at Mount Etna (Cannavò et al., 2017). Overall, BBNs achieve good results in terms of recognizing patterns of behavior that can be used both in forecasts and in detecting the onset of hazardous activity.
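The kind of inference a small BBN supports can be sketched by direct enumeration for a network shaped like the one described for Figure 10b: a hidden "magmatic unrest" node, two observable children (elevated seismicity and deformation), and an eruption query node. The conditional probabilities below are invented for illustration, the function names are hypothetical, and the observations are assumed to be conditionally independent given the unrest state.

```python
def posterior_unrest(prior, likelihoods, evidence):
    """P(unrest | evidence) by Bayes' rule.

    likelihoods maps an observation name to
    (P(observed | unrest), P(observed | no unrest));
    evidence maps an observation name to True or False.
    """
    joint_unrest = prior
    joint_quiet = 1.0 - prior
    for name, seen in evidence.items():
        p_if_unrest, p_if_quiet = likelihoods[name]
        joint_unrest *= p_if_unrest if seen else 1.0 - p_if_unrest
        joint_quiet *= p_if_quiet if seen else 1.0 - p_if_quiet
    return joint_unrest / (joint_unrest + joint_quiet)


def eruption_probability(p_unrest, p_erupt_if_unrest, p_erupt_if_quiet):
    """Marginalize the hidden unrest node to answer the eruption query."""
    return p_unrest * p_erupt_if_unrest + (1.0 - p_unrest) * p_erupt_if_quiet


# Hypothetical conditional probability tables.
likelihoods = {"seismicity": (0.8, 0.1), "deformation": (0.7, 0.05)}
p_unrest = posterior_unrest(
    prior=0.1,
    likelihoods=likelihoods,
    evidence={"seismicity": True, "deformation": True},
)
p_eruption = eruption_probability(p_unrest, 0.3, 0.01)
print(p_unrest, p_eruption)
```

Re-running the same calculation as new observations arrive gives the repeatedly updated forecasts described above, without the network itself modeling time.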

Bringing It All Together: Forecasting in the Twenty-First Century
"Most volcanic systems are too complex and our understanding of them too rudimentary for precise, unequivocal predictions of eruptions and their consequences."--Newhall and Hoblitt (2002) We have argued in sections 4 and 5 that machine learning algorithms and physicochemical volcano models show potential for transforming our ability to forecast volcanic eruptions, but it is important to remember that both approaches face significant obstacles. Training machine learning algorithms generally requires large quantities of data, which for many volcanoes are simply not available, while physicochemical models are limited by inadequate theory (or ability to model the theory) as well as uncertainty in initial conditions. We argue here that combining the two approaches, and/or incorporating both into probabilistic, integrative forecasting frameworks (such as those described in section 6) may help to overcome these limitations and enable a new generation of forecasting tools that can redefine how volcanic activity is forecast for the rest of the twenty-first century.
As described in section 4.3, machine learning algorithms typically make no use of the physical laws governing the systems under study. "Hybrid" marriages of machine learning and physics-based models are relatively unexplored but might bear fruit in any number of ways (Karpatne et al., 2017; Reichstein et al., 2019). Physics-based models can be used within machine learning algorithms to make their outputs more physically realistic, or, conversely, the output of physics-based models can be corrected using machine learning approaches (Reichstein et al., 2019). Machine learning algorithms (along with other, more statistical approaches not usually considered in the context of machine learning; e.g., Wolpert et al., 2018) can also be trained to emulate the output of a computationally expensive physics-based model (Bayarri et al., 2009; DeVries et al., 2017); these fast surrogate models can then be utilized in ensemble forecasts in ways otherwise impossible, for instance in the Bayesian ensembles which are critical for physics-based eruption forecasting.
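The surrogate idea can be illustrated with a deliberately crude stand-in: a piecewise-linear lookup table fitted to a handful of runs of an "expensive" model, then queried cheaply inside an ensemble. A real emulator would use a trained statistical or machine learning model; the exponential decay function, the parameter range, and the class name `LookupEmulator` are assumptions made only for this sketch.

```python
import bisect
import math


def expensive_model(tau, t=10.0):
    """Stand-in for a costly simulation: fraction of the initial eruption
    rate remaining at time t for an exponential decay time tau."""
    return math.exp(-t / tau)


class LookupEmulator:
    """Piecewise-linear surrogate built from a small set of training runs."""

    def __init__(self, model, grid):
        self.grid = sorted(grid)
        self.values = [model(x) for x in self.grid]  # the only expensive calls

    def __call__(self, x):
        if not self.grid[0] <= x <= self.grid[-1]:
            raise ValueError("emulator is only valid inside its training range")
        i = bisect.bisect_right(self.grid, x)
        i = min(max(i, 1), len(self.grid) - 1)
        x0, x1 = self.grid[i - 1], self.grid[i]
        y0, y1 = self.values[i - 1], self.values[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Train on 46 runs, then evaluate a large ensemble without calling the model.
emulator = LookupEmulator(expensive_model, [float(k) for k in range(5, 51)])
ensemble = [emulator(5.0 + 45.0 * k / 9999) for k in range(10000)]
print(emulator(12.5), expensive_model(12.5))
```

The refusal to extrapolate outside the training range is a design choice worth keeping in any surrogate, since emulators give no warning when asked about conditions they never saw.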
Secondly, we believe that there is great potential in utilizing integrative probabilistic forecasting frameworks (e.g., event trees, logic trees, and BBNs) to couple more traditional sources of probabilistic information derived from the geological record, expert elicitation, etc. with ensemble physicochemical model-based eruption forecasts and/or machine learning algorithms. Such an approach would allow all available information to be used to inform forecasts. Furthermore, by allowing the output of deterministic physicochemical models to be weighted using expert opinion, for instance, it should be possible to mitigate, at least in part, one of their chief limitations: the model uncertainty (error) that may easily lead to dangerously erroneous forecasts. The basic structure of logic trees (including event trees) allows forecasts to utilize conditional probabilities derived from any number of sources, so in some ways such an approach is not conceptually challenging. To date, however, physics-based models are rarely utilized in event trees or BBNs (e.g., Marzocchi et al., 2010;Neri et al., 2008;Sandri et al., 2014;Tierz et al., 2017), and when they are used it is mostly to model specific hazards rather than the volcanic system itself and never (to our knowledge) together with probabilistic estimation of that model's parameters (the state of the volcanic system).
An example may help to illustrate the advantages of such an approach. Consider a deterministic physicochemical model of a dome-forming eruption (e.g., Anderson & Segall, 2011; Mastin et al., 2008) utilized with a Bayesian MCMC or Kalman-filter approach to probabilistically forecast the evolution of an ongoing eruption (section 5.2). The resulting forecasts account (as much as possible) for uncertainty in initial conditions (magma pressure, etc.) but do not account for the degree to which the model itself is incorrect; for instance, the eruption may abruptly transition to a different type of behavior outside the scope of the model. However, if the model-based forecast is itself incorporated into an event (or logic) tree, then these possibilities and their outcomes can be assessed with different tools, such as statistical analysis of analog events (e.g., Ogburn et al., 2015) or expert elicitation. In this way, the physics-based model forecasts can be suitably "weighted," and the combined probabilities carried on to other dependent nodes in the network. In this imaginary event tree, the geological record informs the background (prior) probabilities, monitoring data coupled to triggers (thresholds) or machine learning algorithms inform the likelihood of eruption, and Bayesian exercise of the physics-based model together with observations of past events and expert judgment informs probabilities associated with the course of an ongoing eruption (Figure 11). Following this approach, monitoring data are used to constrain the state of the deterministic physics-based model (or models), from which forecasts are naturally derived; this is in contrast to existing event trees in which monitoring data affect probabilities through expert opinion or their influence on certain threshold functions (although the two approaches are not necessarily mutually exclusive in the context of a larger probabilistic forecasting framework).
Furthermore, the output of the physics-based models represents not only the forecast itself (e.g., eruption duration) but also can be used as input to other event tree nodes. For instance, the probability of a collapse-driven pyroclastic flow will increase as a lava dome grows in volume, which is itself an output of the model. Such a complex forecasting framework will not be appropriate in all cases; during a rapidly evolving crisis, expert elicitation (which may still be based on knowledge of patterns and models) may be the only viable approach (e.g., Wright et al., 2018). Simpler approaches are also easier to explain to land managers and the public. But with automated pattern recognition routines in place and scenarios prepopulated with appropriate models, it may still be possible to utilize the full potential of these forecasting tools on short notice. As with any undertaking, preparation is key.
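One way to picture how a model-based forecast might feed an event tree node while still allowing for model inadequacy is a two-branch mixture: one branch assumes the physics-based model holds and uses its ensemble output, the other covers behavior outside the model and takes its probability from some other source such as analog statistics. The dome volumes, threshold, branch weight, and function names below are all hypothetical.

```python
def exceedance_probability(samples, threshold):
    """Fraction of ensemble members whose value exceeds the threshold."""
    return sum(1 for s in samples if s > threshold) / len(samples)


def branch_weighted_probability(p_if_model_holds, p_if_model_fails, weight_model_holds):
    """Law of total probability over the 'model holds' and 'model fails' branches."""
    if not 0.0 <= weight_model_holds <= 1.0:
        raise ValueError("weight_model_holds must lie in [0, 1]")
    return (
        weight_model_holds * p_if_model_holds
        + (1.0 - weight_model_holds) * p_if_model_fails
    )


# Hypothetical posterior-predictive dome volumes (10^6 m^3) from a model ensemble.
dome_volume_samples = [12.0, 18.0, 25.0, 31.0, 40.0, 22.0, 35.0, 28.0]

# Probability that the dome exceeds a (hypothetical) volume of concern.
p_large_dome = exceedance_probability(dome_volume_samples, threshold=30.0)

# Combine with an analog-based estimate for the case where the model breaks down.
p_total = branch_weighted_probability(
    p_if_model_holds=p_large_dome,
    p_if_model_fails=0.6,    # hypothetical value from analog statistics
    weight_model_holds=0.8,  # hypothetical elicited confidence in the model
)
print(p_large_dome, p_total)
```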

Conclusions and Special Challenges
"Volcanic systems are highly non-linear and in certain respects may be inherently unpredictable."--Sparks (2003) Figure 11. Conceptual schematic of an event tree which incorporates many of the elements discussed in the text. Probabilities propagate from left to right, with conditional nodal (decision) probabilities determined by the method(s) listed in red text (these are representative for this example and should not be considered complete). For some paths, probabilities can be propagated all the way to hazard models such as Ash3D for ash deposition (Schwaiger et al., 2012), LAHARZ for lahars (Schilling, 1998), Titan2D for pyroclastic density currents (PDCs; Pitman et al., 2003;Patra et al., 2005), and DOWNFLOW for lava flow modeling (Favalli et al., 2005). Green triangles indicate where additional outside information (from expert elicitation or any suitable technique) would be used to inform the uncertain parameters of these models. Ensembles of these models would be run to produce probabilistic output (for illustrative purposes, we ignore challenges associated with running ensembles of computationally expensive models). The event tree also includes a physicochemical eruption model utilized in a Bayesian data inversion/assimilation algorithm, here for a simple exponentially decaying dome-forming eruption (Anderson & Segall, 2011, 2013. The Bayesian priors for this model can incorporate information from elsewhere in the network, and the predictive probability density functions produced by this model for eruption duration, dome volume, etc. (Segall, 2013) are not only directly useful, but can also be used to inform other nodes (dashed lines). For instance, the predicted volume of the dome directly relates to the maximum size of possible collapse PDCs. 
A key point is that the physicochemical eruption model is not assumed to be "correct"; an additional node accounts for more complex behavior, with the relative likelihood of the two outcomes determined by global databases, etc. Note that branches further to the right are not mutually exclusive (for instance, both ash and lava flows may be erupted).

Journal of Geophysical Research: Solid Earth
Despite much progress, volcanoes still erupt with no detected precursors, volcanoes do not erupt when we think they will, and, with the exception of certain repeating behavior, arguably no successful forecast of eruption style, magnitude, and duration has ever been made (although work linking magmatic properties to monitoring data may begin to close this gap; Cassidy et al., 2018). Our inability to forecast the onsets and evolutions of eruptions is so fundamental that the National Academies of Sciences, Engineering, and Medicine (2017) have described it as a "grand challenge" of volcanology. Yet as outlined in this paper, science is advancing rapidly. Are we on the cusp of a new era of eruption forecasts, leveraging ever-improving data together with emerging techniques such as machine learning and physicochemical models, or will eruption forecasting always remain a challenging and even sometimes impossible undertaking? Perhaps both points of view are correct.
It is certainly an exciting time to be in the field of eruption forecasting. Monitoring data, the raw material for most short- to medium-term forecasts, are constantly improving in spatial and temporal resolution due to both on-the-ground networks and remote sensing observations. For instance, ESA's Sentinel-1 satellites have led to the near-ubiquitous availability of InSAR data, and the development of new, inexpensive spectroscopic methods for measuring volcanic gases (such as Multi-GAS; Aiuppa et al., 2007) has resulted in more continuous gas measurements at volcanoes around the world. Although many important measurements, such as time-variable eruption rate, are still rare, we hope that these limitations spur an increased emphasis on collecting more types of data from more volcanoes and making those data available to the global scientific community. We also expect that as more data become available, automated routines for detecting transients and providing pattern-based forecasts will improve.
Physicochemical volcano models are a natural bridge between observations and interpretations and hint at a future in which eruptions can be forecast using fundamental physical principles, while related hazards models allow quantitative estimation of impacts, such as those from debris flows. Already, physicochemical eruption models are showing their value when used with ensemble forecasting algorithms in hindcasting exercises. New models will allow for investigation of a greater number of preeruptive and coeruptive processes, could encompass a variety of data and uncertainties for use in probabilistic forecasts, and would better allow for evaluation of how model formulation affects forecasts.
Along with tremendous possibilities, however, come tremendous risks, and the ultimate question is probably not whether we have the mechanics to perform model-based forecasts, but whether the many limitations in our ability to understand and model volcanic systems render such forecasts meaningless, or even dangerous. Although ensemble forecasts can help to account for uncertainty in initial conditions, we strongly caution against the naive use of physicochemical models as forecasting tools unless model inadequacy is carefully considered. In many cases, it will be difficult or impossible to derive a useful forecast from a volcano model. Nonetheless, there remains cause for optimism. Volcanoes exhibiting repeating events or quasi-exponential behavior may be amenable to model-based forecasting approaches, and utilization of multimodel ensembles and/or incorporation of the models into larger probabilistic forecasting frameworks may help to mitigate some of these limitations.
Few developments promise to revolutionize eruption forecasting, along with so many other branches of science, as does machine learning. Inadequate "training" data have strongly limited the use of machine learning in eruption forecasting at most volcanoes, but the problem may be partly alleviated by exploiting multiple data streams. Volcanologists would do well to take advantage of the tremendous advances in artificial intelligence taking place largely in other fields and should be prepared to adapt these to our own needs. Machine learning algorithms are not a forecasting panacea, however; their inner workings are often largely opaque and their outputs can be misleading or even unphysical. As with forecasting using physics-based eruption models, we caution against placing too much faith in machine learning forecasts. Merging data-driven machine learning algorithms with physics-based eruption models may help to address some of these limitations.
Encouragingly, volcano forecasts and hazards assessments have become far more probabilistic in recent years. Of course, probabilistic forecasts come with their own set of challenges, including how to communicate uncertainties with land managers and the public (e.g., Doyle et al., 2014), how to think about and account for the uncertainty in uncertainties (Sobradelo & Martí, 2017), and how to evaluate success. This latter challenge is critical because, in some regards, a probabilistic forecast cannot be "wrong" unless the probability of an event is incorrectly assigned 0 or 100% (even very unlikely events do occur in nature). It is only after performing many probabilistic forecasts and comparing with actual outcomes that we can assess overall performance (Newhall & Pallister, 2015; Rouwet et al., 2017). As probabilistic forecasts become more common, it will be vital to perform such analyses and update our approaches as needed.
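Comparing many probabilistic forecasts with what actually happened is commonly done with a scoring rule. The sketch below uses the Brier score and a skill score relative to a constant reference forecast; this particular metric is our choice for illustration and is not prescribed by the works cited above, and the forecast and outcome values are invented.

```python
def brier_score(forecast_probs, outcomes):
    """Mean squared difference between forecast probability and outcome (0 or 1).

    Lower is better; 0 is a perfect score.
    """
    if len(forecast_probs) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    return sum((p - o) ** 2 for p, o in zip(forecast_probs, outcomes)) / len(outcomes)


def brier_skill_score(forecast_probs, outcomes, reference_prob):
    """Skill relative to always forecasting reference_prob (positive = better)."""
    reference = brier_score([reference_prob] * len(outcomes), outcomes)
    return 1.0 - brier_score(forecast_probs, outcomes) / reference


# Hypothetical record: four forecasts of eruption and whether one occurred.
forecasts = [0.9, 0.1, 0.7, 0.2]
outcomes = [1, 0, 0, 0]

score = brier_score(forecasts, outcomes)
skill = brier_skill_score(forecasts, outcomes, reference_prob=0.25)
print(score, skill)
```

With only four forecasts the numbers mean little, which is the point made above: performance can be judged only once many forecasts have accumulated.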
Much of the volcanological community has settled on some form of probabilistic forecasting framework (event trees in particular) within which to incorporate geological history, real-time monitoring data, analog systems, and conceptual and quantitative models in order to forecast future eruptive behavior. We encourage the incorporation of new materials into these existing "plumbing systems": machine learning algorithms, new types of data, and the growing library of physicochemical models of volcanic processes. This approach allows for a modular form of forecasting that is constantly taking advantage of developing tools/technology and refined understanding of how volcanoes work. Implementation may be challenging, but we are optimistic for progress given that this approach directs all aspects of volcanology (geological mapping, monitoring, modeling, etc.) toward a single goal. All disciplines can contribute, and so all scientists working on volcanic problems have a role to play.
It may never be possible to forecast every eruption on a time scale and with a degree of confidence that is useful to society, but we believe that great progress is on the horizon. The volcano observatory system, championed and implemented by scientists like Thomas Jaggar, is the ideal setting for realizing a new vision of probabilistic eruption forecasting (Pallister et al., 2019). Observatories and their partners establish monitoring networks and interpret the geological record, and they are usually tasked with warning civil authorities of the potential for hazardous volcanic activity, a critical responsibility in an age of growing exposure of populations to volcanic processes. We are confident that the years to come will see continued expansion of monitoring networks, further automation of pattern-recognition routines, and development of physics-based models. When integrated at the volcano observatories responsible for issuing warnings of eruptive activity, this approach offers great hope for increasingly accurate and timely forecasts of the onset and evolution of hazardous volcanic eruptions.