On the lognormality of historical magnetic storm intensity statistics: Implications for extreme-event probabilities
Abstract
An examination is made of the hypothesis that the statistics of magnetic storm maximum intensities are the realization of a lognormal stochastic process. Weighted least squares and maximum likelihood methods are used to fit lognormal functions to −Dst storm time maxima for years 1957–2012; bootstrap analysis is used to establish confidence limits on forecasts. Both methods provide fits that are reasonably consistent with the data; both methods also provide fits that are superior to those that can be made with a power-law function. In general, the maximum likelihood method provides forecasts having tighter confidence intervals than those provided by weighted least squares. From extrapolation of maximum likelihood fits: a magnetic storm with intensity exceeding that of the 1859 Carrington event, −Dst ≥ 850 nT, occurs about 1.13 times per century, with a wide 95% confidence interval of [0.42, 2.41] times per century; a 100 year magnetic storm is identified as having −Dst ≥ 880 nT (greater than Carrington), but with a wide 95% confidence interval of [490, 1187] nT.
Key Points
- Storm occurrence might be a lognormal process
- Storm occurrence is not well modeled as a power-law process
- Confidence limits on forecasts remain wide due to few data
1 Introduction
A magnetic storm can be understood as the causal response of the Earth's coupled magnetospheric-ionospheric system to the variable and dynamic action of the solar wind [e.g., Cowley, 1995]. Scientific analysis of magnetic storms leads to improved fundamental understanding of the Earth's surrounding space weather environment [e.g., Prölss, 2004]. Applied analysis of magnetic storms enables assessment and mitigation of space weather-related hazards. Magnetic storms are associated with disruptions of over-the-horizon radio communication, degradation in the accuracy of global positioning systems, damage to satellite electronics and increased orbital drag, interference with geophysical surveys, and the induction of uncontrolled currents in electric power grids that sometimes cause blackouts [e.g., Daglis, 2004]. Of particular concern are rare but extremely intense magnetic storms [e.g., Hapgood, 2011], the effects of which [e.g., Kappenman, 2012; Cannon et al., 2013] could have widespread deleterious economic impact [e.g., Baker et al., 2008].
The space weather index Dst is a measure of the longitudinal average of low-latitude, ground-level magnetic disturbance. It is useful for identifying the evolutionary phases of individual magnetic storms [e.g., McPherron, 1995; Loewe and Prölss, 1997] and for compiling statistics on storm occurrence rate and intensity. For example, the arrival at Earth of a coronal mass ejection compresses the dayside magnetopause, intensifying its eastward directed current and generating a positive perturbation in Dst. With connection of the interplanetary magnetic field onto the geomagnetic field and solar wind driving of magnetospheric convection, energy can be loaded into the westward directed magnetotail current, which generates a negative magnetic perturbation in Dst. During a magnetic storm, electric currents can be induced in the ionosphere, and during a substorm, magnetospheric field-aligned currents flow into and out of the ionosphere; these currents generate magnetic field perturbations that are recorded in Dst. The main phase of a magnetic storm is defined by a global-scale decrease in low-latitude geomagnetic field intensity [e.g., Gonzalez et al., 1994] that is generated by the westward directed magnetospheric ring current, the intensity of which is approximately proportional to the corresponding negative perturbation in Dst. A storm main phase can persist for a day or two, then, with diminution in solar wind forcing, the ring current intensity eventually dissipates, low-latitude magnetic disturbance decreases, and Dst returns to its near-zero prestorm baseline.
From analysis of historical Dst time series data (1957–2012), we quantify the occurrence rate of magnetic storms as a function of main phase storm maximum intensity. A plausible statistical model is required to make extrapolated estimates of the occurrence rate of storms more intense and rarer than those observed since 1957. For this purpose, a lognormal statistical process is proposed as a hypothetical model of storm time Dst statistics. We examine the motivation for assuming a lognormal process, we test a lognormal function for consistency with the Dst data, and we compare and contrast the lognormal model with a power-law model. Results contribute to fundamental understanding of extreme space weather events [e.g., Koons, 2001; Crosby, 2011], and especially intense geomagnetic storms [e.g., Tsubouchi and Omura, 2007; Thomson et al., 2011; Cliver and Dietrich, 2013]. This work is motivated by United States domestic agency [National Science and Technology Council, 2015] and international agency [e.g., Schrijver et al., 2015] priorities and strategic plans for the assessment and mitigation of space weather hazards [e.g., Love et al., 2014].
2 The Dst Time Series
The standard Dst index is provided by the Kyoto World Data Center in Japan. It is derived from hourly mean, horizontal intensity geomagnetic time series collected at four observatories [Sugiura and Kamei, 1991]: Hermanus (South African National Space Agency), Kakioka (Japan Meteorological Agency), Honolulu (U.S. Geological Survey), and San Juan (U.S. Geological Survey). A geomagnetic disturbance time series for each observatory is estimated by subtracting a nonstorm, quiet time baseline. The Dst time series is an average of the individual disturbance time series from the four observatories. We use “definitive” Dst values derived from calibrated magnetic observatory data. The Kyoto Dst index is “quantized” to the nearest 1 nT. Note that we have chosen not to use a version of Dst corrected for magnetopause currents generated by solar wind pressure [e.g., Burton et al., 1975] nor have we corrected for tail current [e.g., Turner et al., 2000]. While such corrections, and others that might be proposed, are interesting and important, our focus, here, is more strictly phenomenological; we are concerned with total magnetic storm intensity as measured by Dst regardless of the detailed cause of storm intensity.
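For illustration only, the averaging just described can be sketched as follows; this is a minimal sketch, not the Kyoto derivation itself, which includes additional observatory-specific corrections (for example, normalization by magnetic latitude), and the array names are hypothetical:

```python
import numpy as np

def dst_like_index(horizontal_intensity, quiet_baseline):
    """Schematic Dst-like index: average, over four observatories, of the
    horizontal-intensity disturbance after quiet time baseline removal.

    horizontal_intensity, quiet_baseline : arrays of shape (4, n_hours), in nT.
    """
    disturbance = horizontal_intensity - quiet_baseline  # per-observatory disturbance
    return disturbance.mean(axis=0)                      # average across the four observatories
```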
In Figure 1 we show the Dst time series for the years 1957–2012, a period of time that encompasses six complete solar cycles: from the rise phase of solar cycle 19 through to the rise phase of cycle 24. Although this duration does not stretch from one sunspot minimum to another minimum, all of the solar cycle phases (rise, maximum, decline, and minimum) are represented approximately equally; thus, we have little bias in our amalgamation of Dst statistics across solar cycles. For 1957–2012, there are 490,896 separate hourly Dst values, and there are no gaps or obviously erroneous values. We subsample the Dst time series for storm maxima −Dst values (these particular Dst values are negative): paying attention to the identification of both intense magnetic storms (which are rare) and weak magnetic storms (which are numerous), we use a simple algorithm that ranks and winnows the Dst data, counting all storm maxima −Dst values greater than or equal to 63 nT, a convenience corresponding to a bin boundary from among the 10 that cover a decade of range in −Dst. Storms with maxima less than this threshold can be difficult to distinguish from intense magnetic storms having multistep evolution or from occasional periods of general magnetic disturbance that might precede or follow a storm.
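The precise rank-and-winnow rules are not spelled out above; as a minimal sketch, under the assumption that maxima closer together than a fixed window belong to the same storm (the window length here is an illustrative choice, not a value from this study), the subsampling could be approximated as:

```python
import numpy as np

def storm_maxima(dst, threshold=63, window_hours=48):
    """Rank-and-winnow sketch: return hourly indices of storm maxima with
    -Dst >= threshold (nT), keeping only the largest value within any
    window_hours neighborhood so that each storm is counted once."""
    neg_dst = -np.asarray(dst, dtype=float)
    ranked = np.argsort(neg_dst)[::-1]          # hours ranked by -Dst, most intense first
    accepted = []
    for j in ranked:
        if neg_dst[j] < threshold:
            break                               # all remaining hours are below threshold
        if all(abs(j - k) > window_hours for k in accepted):
            accepted.append(j)                  # winnow hours adjacent to a stronger maximum
    return sorted(accepted)
```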

We note some specifics. Severe geomagnetic activity was realized over about one and a half days during the 13–14 March 1989 storm [e.g., Allen et al., 1989], but in compiling storm maxima statistics, this storm is counted only once with a maximum −Dst value of 589 nT; similarly, the 15–20 July 1959 storm is counted only once with a maximum −Dst of 429 nT, etc. For the years 1957–2012, we identify 1051 separate storms having maximum −Dst ≥ 63 nT; of these, 68 storms have a maximum −Dst value exceeding 200 nT, 21 exceed 300 nT, and only 5 exceed 400 nT. For convenience, we denote these storm maximum values as −Dstj, with each value corresponding to the universal time hour tj of a storm's maximum intensity. A complete list of the storm maximum values is given in the supporting information for this article.
3 Lognormal Statistics
We examine the hypothesis that the −Dstj data can be treated as a realization of a lognormal stochastic process [e.g., Pulkkinen et al., 2008, paragraph 29]. Roughly speaking, multiplicative processes operating in three different domains affect the statistics of magnetic storms. First, the solar cycle results from autoregressive dynamo action, an ∼11 year quasiperiodic amplification and deamplification of poloidal solar magnetic energy [e.g., Solanki et al., 2000] that is manifest as a modulation of sunspot number, coronal mass ejections, and geomagnetic storm occurrence rate. As an aside, we note that statistical models for periodically modulated phenomena can be parameterized in terms of an integrated average rate of occurrence [e.g., Cox and Lewis, 1966, chapter 2.2.v]. Second, the geoeffectiveness of solar wind-magnetosphere coupling is often calculated from multiplicative combinations of solar wind velocity, density, and interplanetary magnetic field direction and intensity [e.g., Newell et al., 2007]. While these variables are not, themselves, statistically independent, satellite data acquired over multiple solar cycles show that solar wind density and dynamic pressure are approximately lognormally distributed [e.g., Veselovsky et al., 2010]. Third, the dynamic evolution of a magnetic storm can be modeled as an autoregressive filter of solar wind forcing and an integration of previous recent states of the magnetospheric-ionospheric system [e.g., Vassiliadis et al., 1999]. Considering, then, all these physical processes together, the multiplicative central limit theorem could be of relevance. We emphasize, however, that optimism should be tempered with recognition that the three physical domains do not act independently to determine magnetic storm intensity. In the end, any practical assessment of the success or failure of the lognormal hypothesis will be largely dependent on its consistency with the −Dstj data.
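For reference, the occurrence model being tested can be written compactly; this is a sketch in a parameterization consistent with Table 1, in which A is the notional total annual rate of the underlying (untruncated) lognormal process and E and S, related to μ and σ through the conversion given in the Table 1 note, are its mean and standard deviation:

```latex
% Lognormal occurrence model: n(x) is the expected annual number of storm
% maxima near intensity x, N(>= x) the expected annual number exceeding x,
% and Phi the standard normal cumulative distribution function.
n(x) = \frac{A}{x \,\sigma \sqrt{2\pi}}
       \exp\!\left[ -\frac{(\ln x - \mu)^{2}}{2\sigma^{2}} \right],
\qquad
N(\geq x) = A \left[ 1 - \Phi\!\left( \frac{\ln x - \mu}{\sigma} \right) \right].

% Multiplicative central limit theorem: a product of many independent,
% positive factors f_i has a logarithm that tends toward a normal
% distribution, so the product itself tends toward a lognormal.
\ln x = \sum_{i} \ln f_{i} \;\longrightarrow\; \mathcal{N}(\mu, \sigma^{2})
\quad\Longrightarrow\quad x \sim \mathrm{Lognormal}(\mu, \sigma^{2}).
```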
4 Fitting Models to Data
We use a bootstrap method [e.g., Efron and Tibshirani, 1993] to estimate the errors of extrapolated storm occurrence rates and intensities. For this, the (measured, “mother”) −Dstj data are treated as a population distribution from which samples are drawn. We randomly sample with replacement from the −Dstj data, assembling numerous empirical data sets, each having size equal to that of the mother data set. We assume, as is standard, that each empirical data set is a plausible sample of magnetic storm intensities that could have been realized in the past and which might be realized in the future. We fit each empirical data set with a separate lognormal function using either of the described maximum likelihood or weighted least squares methods. From the set of bootstrap empirical fits, then, we estimate confidence intervals on predicted occurrence rates and intensities.
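As a sketch of this procedure (assuming NumPy and SciPy are available; the likelihood shown, a lognormal left-truncated at the 63 nT threshold, is one plausible construction and not necessarily the exact form used in this study):

```python
import numpy as np
from scipy import stats, optimize

def fit_truncated_lognormal(x, cutoff=63.0):
    """Maximum likelihood fit of a lognormal to storm maxima observed only
    above `cutoff` (nT); the likelihood is conditioned on x >= cutoff."""
    x = np.asarray(x, dtype=float)
    def neg_log_like(params):
        mu, log_sigma = params
        sigma = np.exp(log_sigma)                              # keep sigma positive
        logpdf = stats.norm.logpdf(np.log(x), mu, sigma) - np.log(x)
        logsf = stats.norm.logsf((np.log(cutoff) - mu) / sigma)
        return -np.sum(logpdf - logsf)
    start = [np.mean(np.log(x)), np.log(np.std(np.log(x)))]
    fit = optimize.minimize(neg_log_like, start, method="Nelder-Mead")
    mu, log_sigma = fit.x
    return mu, np.exp(log_sigma)

def rate_per_century(x, years, threshold, cutoff=63.0):
    """Expected number of storms per century with -Dst >= threshold (nT)."""
    mu, sigma = fit_truncated_lognormal(x, cutoff)
    amplitude = len(x) / (years * stats.norm.sf((np.log(cutoff) - mu) / sigma))
    return 100.0 * amplitude * stats.norm.sf((np.log(threshold) - mu) / sigma)

def bootstrap_interval(x, years, threshold, n_boot=2000, seed=0):
    """Percentile [2.5, 50, 97.5] interval from resampling with replacement."""
    rng = np.random.default_rng(seed)
    rates = [rate_per_century(rng.choice(x, size=len(x), replace=True),
                              years, threshold)
             for _ in range(n_boot)]
    return np.percentile(rates, [2.5, 50.0, 97.5])
```

Applied to the 1051 −Dstj values with years = 56, a loop of this kind over exceedance thresholds yields rate estimates and percentile intervals analogous to, though not necessarily identical with, the maximum likelihood entries in Table 2.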
5 Results
In Figure 2a we show histograms (black) of binned densities of storm maxima −Dstj ≥ 63 nT for years 1957–2012, and in Figure 2b we show the corresponding exceedance cumulatives (black). We also show lognormal model functions fitted by maximum likelihood (blue) and weighted least squares (red) methods; model parameters are listed in Table 1. Before we continue, we remark that these model parameters can be used to make extrapolated estimates of storm occurrence rates and intensities, but the accuracy of such extrapolations is poor for storms that are far out in the extreme-event tail of the distribution, having intensities substantially greater than the measured maximum of −Dstj = 589 nT; in section 7 we report confidence intervals. Similarly, since the expected value E(x) is far below the cutoff of 63 nT, the parameters μ and σ are sensitive functions of the estimation method, and detailed interpretation of their values should be made with caution.

| Method | A (Number/yr) | E (nT) | S (nT) | p(KS) | p(χ²) |
| --- | --- | --- | --- | --- | --- |
| ML | 3168.58 | 5.42 | 11.79 | 0.0201 | 0.3504 |
| WLS | 82.99 | 45.64 | 44.49 | 0.0000 | 0.9999 |
- a To convert to the parameterization used in this analysis, $\sigma^2 = \ln\!\left(1 + S^2/E^2\right)$ and $\mu = \ln E - \sigma^2/2$.
Judged visually, fitted lognormal model functions seem to provide reasonably good representations of the binned −Dstj data. In more detail, however, the maximum likelihood fit is very good for the lowest intensity bin, 63–79 nT, while the least squares fit is not so good for that bin. On the other hand, the maximum likelihood fit is not so good for the highest intensity bin, 501–631 nT, for which there is only one count, while the least squares fit is very good for that bin. The χ2 measure of relative discrepancy [Press et al., 1992, chapter 14.3] between the maximum likelihood fit and the binned data, equation (9), would be a moderately likely statistical realization of random data, with significance probability p = 0.3504; the least squares fit would be a very likely statistical realization of random data, p = 0.9999. On this basis, we cannot confidently reject the null hypothesis that the data are realized from a lognormal process. Similar observations pertain for comparisons of the model fits with the data cumulatives. The Kolmogorov-Smirnov D measure [Press et al., 1992, chapter 14.3] of the greatest absolute discrepancy between the maximum likelihood fit and the data cumulatives would not be an especially likely statistical realization of random data, with significance probability p = 0.0201, but it would be very unlikely for the least squares fit, p = 0.0000. In contrast to the evaluation of the binned data, in judging the cumulatives we might be tempted to reject the null hypothesis that the data are realized from a lognormal process.
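For concreteness, goodness-of-fit probabilities of this kind can be computed with standard routines; the following is a sketch (assuming SciPy; the binning, degrees-of-freedom bookkeeping, and treatment of estimated parameters used here are not reproduced, and the Kolmogorov-Smirnov probability is only approximate when μ and σ are estimated from the same data):

```python
import numpy as np
from scipy import stats

def chi2_probability(observed, expected, n_fitted=3):
    """Chi-square significance for binned counts against a fitted model;
    degrees of freedom reduced by the assumed number of fitted parameters."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    chi2 = np.sum((observed - expected) ** 2 / expected)
    dof = len(observed) - n_fitted                 # bookkeeping assumption
    return stats.chi2.sf(chi2, dof)

def ks_probability(storm_maxima, mu, sigma, cutoff=63.0):
    """Kolmogorov-Smirnov D and significance against a lognormal left-truncated
    at cutoff, i.e., the CDF of -Dst conditioned on -Dst >= cutoff."""
    def truncated_cdf(x):
        z = (np.log(x) - mu) / sigma
        z0 = (np.log(cutoff) - mu) / sigma
        return (stats.norm.cdf(z) - stats.norm.cdf(z0)) / stats.norm.sf(z0)
    return stats.kstest(storm_maxima, truncated_cdf)
```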
It is worth emphasizing that the preceding formal and somewhat inconsistent assessments of significance should be regarded with a practical awareness of the considerable variability in −Dstj that accompanies the solar cycle waxing and waning of sunspots and the variable geoeffectiveness of the solar wind. In Figures 2a and 2b we show, respectively, the data densities and cumulatives for each of the six individual solar cycles (gray). Among these, the rise of cycle 19 to the rise of cycle 20 (1957–1966) corresponds to a high number of sunspots (solar maximum: 201.3), while the rise of cycle 20 to the rise of cycle 21 (1967–1977) corresponds to a low number of sunspots (solar maximum: 110.6); the present solar cycle (24) is not complete at the time of this writing. The solar cycle subsets show substantial dispersion about the densities and cumulatives that are based on all the data (black). Indeed, some of the misfit between the data cumulative and the lognormal model fits, for example, for bin 251–316 nT, might be attributed to magnetic storms realized during a single solar cycle. More generally, given the dispersion seen in the data for individual solar cycles, the lognormal fits obtained by either maximum likelihood (blue) or least squares (red) methods might be considered rather satisfactory representations of −Dstj.
6 Power-Law Statistics
Nonetheless, we might still be curious as to why the −Dstj data are not well represented by a power-law function. Some physical systems that exhibit power-law statistics can be described in terms of self-organized criticality (SOC) [e.g., Turcotte, 1999; Aschwanden, 2011]. In particular, the detailed time evolution of individual magnetic substorms caused by collapses of the magnetotail has sometimes been described in terms of SOC dynamics [Angelopoulos et al., 1999; Chang, 1999]. Both numerical simulations of substorm dynamics [Klimas et al., 2000] and ground-level geomagnetic data [e.g., Wanliss, 2005; Pulkkinen et al., 2006; Balasis et al., 2009] exhibit “multifractal” power-law statistics, with the power-law exponent α being a function of solar wind forcing. If we accept this, then in analyzing −Dstj data recording many years of geomagnetic activity, we are mixing data drawn from a quiet time statistical distribution with data drawn from a broad range of storm time power-law distributions. It is not evident that such an amalgamation could give a power-law distribution. And, indeed, the −Dstj data are evidently not consistent with a power-law process.
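For completeness, the power-law alternative referred to here, with α the scaling exponent and x_min a lower cutoff, has the standard continuous form [e.g., Clauset et al., 2009]; the exact parameterization used in this study is not reproduced:

```latex
% Continuous power-law density above a lower cutoff x_min (alpha > 1),
% and the corresponding exceedance probability.
p(x) = \frac{\alpha - 1}{x_{\min}} \left( \frac{x}{x_{\min}} \right)^{-\alpha},
\qquad
P(\geq x) = \left( \frac{x}{x_{\min}} \right)^{-(\alpha - 1)}.
```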
The far tail of a lognormal distribution never behaves like that of a power law: it always, eventually, falls off more rapidly [e.g., Malevergne et al., 2011, equation 3]. Therefore, an extrapolation of a lognormal distribution for extreme-event probabilities will give results that are different from those of an extrapolated power-law distribution [e.g., Clauset et al., 2009, Figure 5].
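The contrast is explicit in the logarithms of the two exceedance probabilities: at large x the lognormal falls off quadratically in ln x, while a power law falls off only linearly, so the lognormal tail is eventually the thinner of the two:

```latex
% Leading-order tail behavior of the log exceedance probabilities.
\ln P_{\mathrm{lognormal}}(\geq x) \sim -\frac{(\ln x - \mu)^{2}}{2\sigma^{2}},
\qquad
\ln P_{\mathrm{power\;law}}(\geq x) = -(\alpha - 1) \ln\frac{x}{x_{\min}}.
```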
7 Extreme-Event Extrapolations
From fits of lognormal functions to bootstrap test samplings of the −Dstj data, we estimate median storm occurrence rates and confidence intervals; results are summarized in Table 2. For −Dstj ≥ 200 nT exceedance thresholds, maximum likelihood results have occurrence rates with tighter confidence intervals than corresponding least squares results. More specifically, storms with intensities exceeding that of the 1989 storm, −Dstj ≥ 589 nT, have a median maximum likelihood rate of 4.03 times per century and a 95% confidence interval of [2.01, 6.84] times per century. Corresponding least squares results indicate that storms with intensities exceeding that of the 13–14 March 1989 storm occur with a median rate of 1.86 times per century, but with a (very wide) 95% confidence interval of [0.33, 8.18] times per century. These results can be compared with those of Love [2012, equations (20) and (21)], who applied Bayesian analysis to a Poisson model to obtain an estimated occurrence rate and corresponding “credibility” intervals from counts of random events realized over a finite duration of time. Applying his formulas to a count of just one event in 56 years, like that for the largest −Dstj datum corresponding to the 1989 storm, we obtain an estimated occurrence rate of 1.79 times per century with an (approximate) 95% confidence interval of [0.00, 7.14] times per century. We note that this interval encompasses both the maximum likelihood rate estimate, 4.03 times per century, and the least squares rate estimate, 1.86 times per century.
| Method | Example Storm | −Dstj (nT) | Median Rate (/100 years) | 95% Confidence Interval (/100 years) |
| --- | --- | --- | --- | --- |
| ML | | 100 | 658.09 | [611.85, 706.15] |
| WLS | | 100 | 675.51 | [603.72, 741.25] |
| ML | | 200 | 109.85 | [91.78, 129.53] |
| WLS | | 200 | 108.03 | [75.61, 142.62] |
| ML | | 300 | 34.27 | [25.03, 44.41] |
| WLS | | 300 | 28.39 | [13.73, 51.35] |
| ML | 29–31 Oct 2003 | 383 | 16.26 | [10.56, 22.82] |
| WLS | 29–31 Oct 2003 | 383 | 11.09 | [3.94, 26.94] |
| ML | | 400 | 14.18 | [8.98, 20.24] |
| WLS | | 400 | 9.30 | [3.11, 23.97] |
| ML | | 500 | 6.95 | [3.85, 10.88] |
| WLS | | 500 | 3.71 | [0.89, 12.99] |
| ML | 13–14 Mar 1989 | 589 | 4.03 | [2.01, 6.84] |
| WLS | 13–14 Mar 1989 | 589 | 1.86 | [0.33, 8.18] |
| ML | | 600 | 3.79 | [1.87, 6.50] |
| WLS | | 600 | 1.72 | [0.29, 7.76] |
| ML | 1–2 Sept 1859 | 850 | 1.13 | [0.42, 2.41] |
| WLS | 1–2 Sept 1859 | 850 | 0.36 | [0.03, 2.79] |
- a Values for both maximum likelihood (ML) and weighted least squares (WLS) are listed.
Extrapolated estimates of occurrence rates for even more intense magnetic storms are accompanied by even greater uncertainty. For storms having intensities exceeding that of the 1–2 September 1859 Carrington superstorm [Tsurutani et al., 2003; Siscoe et al., 2006; Cliver and Dietrich, 2013], −Dstj ≥ 850 nT, maximum likelihood results have a median occurrence rate of 1.13 times per century and a 95% confidence interval of [0.42, 2.41] times per century. Corresponding least squares results indicate a median occurrence rate of 0.36 times per century with a 95% confidence interval of [0.03, 2.79] times per century. An extrapolation from a power-law model of the −Dstj data gives an occurrence rate of about 1.20 times per century for storms with intensities exceeding that of the Carrington event [e.g., Riley, 2012]; this is just slightly higher than the median rates we estimate, and well within our estimated 95% confidence intervals.
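As a worked check of how such rates follow from the fitted model (using the Table 1 maximum likelihood point estimates and the E, S to μ, σ conversion from the Table 1 note; small differences from the tabulated bootstrap medians are expected):

```python
import numpy as np
from scipy import stats

# Table 1 maximum likelihood point estimates (not the bootstrap medians).
A, E, S = 3168.58, 5.42, 11.79                 # storms/yr, nT, nT
sigma = np.sqrt(np.log(1.0 + (S / E) ** 2))    # convert (E, S) to lognormal (mu, sigma)
mu = np.log(E) - 0.5 * sigma ** 2

def rate_per_century(threshold_nT):
    """Expected number of storms per century with -Dst >= threshold_nT."""
    return 100.0 * A * stats.norm.sf((np.log(threshold_nT) - mu) / sigma)

print(rate_per_century(589))   # ~4.1 per century (March 1989 class)
print(rate_per_century(850))   # ~1.1 per century (Carrington class)
```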
From fits of lognormal functions to bootstrap test samplings of the −Dstj data, we can also estimate magnetic storm exceedance intensity given a prescribed occurrence rate; results are summarized in Table 3. For once-per-decade or “10 year” events, maximum likelihood results indicate a 10 year median of −Dstj ≥ 447 nT and a 95% confidence interval of [389, 515] nT. Corresponding least squares results indicate a 10 year median of −Dstj ≥ 393 nT with a 95% confidence interval of [320, 549] nT. Once-per-century or “100 year” exceedance probabilities [e.g., Koons, 2001; Thomson et al., 2011; Pulkkinen et al., 2012] are often used to define space weather hazard mitigation standards [e.g., Fennell et al., 2001; Samuelsson, 2013; North American Electric Reliability Corporation, 2014]. Maximum likelihood bootstrap results indicate a 100 year median of −Dstj ≥ 880 nT with a 95% confidence interval of [697, 1146] nT. Corresponding least squares results indicate a 100 year median exceedance of −Dstj ≥ 680 nT with a 95% confidence interval of [490, 1187] nT.
| Method | Once per (years) | −Dstj (nT) | 95% Confidence Interval (nT) |
| --- | --- | --- | --- |
| ML | 10 | 447 | [389, 515] |
| WLS | 10 | 393 | [320, 549] |
| ML | 100 | 880 | [697, 1146] |
| WLS | 100 | 680 | [490, 1187] |
- a Values for both maximum likelihood (ML) and weighted least squares (WLS) are listed.
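Inverting the same relation gives the exceedance intensity for a prescribed occurrence rate; a sketch for the 10 year and 100 year storms, again from the Table 1 maximum likelihood point estimates (so the values differ slightly from the bootstrap medians in Table 3):

```python
import numpy as np
from scipy import stats

A, E, S = 3168.58, 5.42, 11.79                 # Table 1 maximum likelihood point estimates
sigma = np.sqrt(np.log(1.0 + (S / E) ** 2))
mu = np.log(E) - 0.5 * sigma ** 2

def exceedance_intensity(rate_per_year):
    """-Dst (nT) exceeded, on average, rate_per_year times per year:
    solve A * sf((ln d - mu) / sigma) = rate_per_year for d."""
    z = stats.norm.isf(rate_per_year / A)      # inverse survival function of the normal
    return np.exp(mu + sigma * z)

print(exceedance_intensity(1.0 / 10.0))    # ~450 nT: the "10 year" storm
print(exceedance_intensity(1.0 / 100.0))   # ~880 nT: the "100 year" storm
```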
8 Accuracy Versus Stability
In comparing the maximum likelihood and least squares lognormal fits of the −Dstj data, there is, evidently, a trade-off between the accuracy with which fits can be made to historical extreme-event data and the stability of the forecasts that might be made of future extreme events. The maximum likelihood method we use accurately fits most of the data, the preponderance of which record magnetic storms of weak intensity. The weighted least squares method gives an accurate fit of the extreme-event tail of the data distribution. Therefore, as a representation of the statistics of (past) extreme-event magnetic storms, the least squares fit might be preferred. On the other hand, the maximum likelihood method gives confidence intervals for estimated future storm occurrence rates and intensities that are tighter than those given by the least squares method. This difference can be attributed to the different weightings of the data made by the two fitting methods. The maximum likelihood method fits all the data with equal weight, and this tends to stabilize the model estimates, while the weighted least squares method fits the binned data with weights that depend on bin size, and this makes the fits sensitive to rare extreme-event statistics. From these modest observations we understand that to confidently forecast the future occurrence of extremely intense magnetic storms we need a model that can fit data recording past magnetic storms across a wide range of intensities. Still, with limited quantities of historical data, we predict that it will be a long time before we can substantially reduce the uncertainty of long-term forecasts of extremely intense magnetic storms.
Acknowledgments
We thank C.A. Finn, J. McCarthy, M.P. Moschetti, and J.L. Slate for reviewing a draft manuscript. We thank M.A. Balikhin and A. Kelbert for useful conversations. This work was supported by the USGS Geomagnetism Program. The standard Dst index is provided by the Kyoto World Data Center in Japan (wdc.kugi.kyoto-u.ac.jp).
The Editor thanks two anonymous reviewers for their assistance in evaluating this paper.