# On the lognormality of historical magnetic storm intensity statistics: Implications for extreme-event probabilities

## Abstract

An examination is made of the hypothesis that the statistics of magnetic storm maximum intensities are the realization of a lognormal stochastic process. Weighted least squares and maximum likelihood methods are used to fit lognormal functions to −*D**s**t* storm time maxima for years 1957–2012; bootstrap analysis is used to establish confidence limits on forecasts. Both methods provide fits that are reasonably consistent with the data; both methods also provide fits that are superior to those that can be made with a power-law function. In general, the maximum likelihood method provides forecasts having tighter confidence intervals than those provided by weighted least squares. From extrapolation of maximum likelihood fits: a magnetic storm with intensity exceeding that of the 1859 Carrington event, −*D**s**t* ≥ 850 nT, occurs about 1.13 times per century, with a wide 95% confidence interval of [0.42, 2.41] times per century; a 100 year magnetic storm is identified as having a −*D**s**t* ≥ 880 nT (greater than Carrington), but with a wide 95% confidence interval of [697, 1146] nT.

## Key Points

- Storm occurrence might be a lognormal process
- Storm occurrence is not well modeled as a power-law process
- Confidence limits on forecasts remain wide due to few data

## 1 Introduction

A magnetic storm can be understood as the causal response of the Earth's coupled magnetospheric-ionospheric system to the variable and dynamic action of the solar wind [e.g., *Cowley*, 1995]. Scientific analysis of magnetic storms leads to improved fundamental understanding of the Earth's surrounding space weather environment [e.g., *Prölss*, 2004]. Applied analysis of magnetic storms enables assessment and mitigation of space weather-related hazards. Magnetic storms are associated with disruptions of over-the-horizon radio communication, degradation in the accuracy of global positioning systems, damage to satellite electronics and increased orbital drag, interference with geophysical surveys, and the induction of uncontrolled currents in electric power grids that sometimes cause blackouts [e.g., *Daglis*, 2004]. Of particular concern are rare but extremely intense magnetic storms [e.g., *Hapgood*, 2011], the effects of which [e.g., *Kappenman*, 2012; *Cannon et al.*, 2013] could have widespread deleterious economic impact [e.g., *Baker et al.*, 2008].

The space weather index *D**s**t* is a measure of the longitudinal average of low-latitude, ground-level magnetic disturbance. It is useful for identifying the evolutionary phases of individual magnetic storms [e.g., *McPherron*, 1995; *Loewe and Prölss*, 1997] and for compiling statistics on storm occurrence rate and intensity. For example, the arrival at Earth of a coronal mass ejection compresses the dayside magnetopause, intensifying its eastward directed current and generating a positive perturbation in *D**s**t*. With connection of the interplanetary magnetic field onto the geomagnetic field and solar wind driving of magnetospheric convection, energy can be loaded into the westward directed magnetotail current, which generates a negative magnetic perturbation in *D**s**t*. During a magnetic storm, electric currents can be induced in the ionosphere, and during a substorm, magnetospheric field-aligned currents flow into and out of the ionosphere; these currents generate magnetic field perturbations that are recorded in *D**s**t*. The main phase of a magnetic storm is defined by a global-scale decrease in low-latitude geomagnetic field intensity [e.g., *Gonzalez et al.*, 1994] that is generated by the westward directed magnetospheric ring current, the intensity of which is approximately proportional to the corresponding negative perturbation in *D**s**t*. A storm main phase can persist for a day or two, then, with diminution in solar wind forcing, the ring current intensity eventually dissipates, low-latitude magnetic disturbance decreases, and *D**s**t* returns to its near-zero prestorm baseline.

From analysis of historical *D**s**t* time series data (1957–2012), we quantify the occurrence rate of magnetic storms as a function of main phase, storm maximum intensity. A plausible statistical model is required to make extrapolated estimates of the occurrence rate of storms more intense and rarer than those observed since 1957. For this purpose, a lognormal statistical process is proposed as a hypothetical model of storm time *D**s**t* statistics. We examine the motivation for assuming a lognormal process, we test a lognormal function for consistency with the *D**s**t* data, and we compare and contrast the lognormal model with a power-law model. Results contribute to fundamental understanding of extreme space weather events [e.g., *Koons*, 2001; *Crosby*, 2011], and especially intense geomagnetic storms [e.g., *Tsubouchi and Omura*, 2007; *Thomson et al.*, 2011; *Cliver and Dietrich*, 2013]. This work is motivated by United States domestic agency [*National Science and Technology Council*, 2015] and international agency [e.g., *Schrijver et al.*, 2015] priorities and strategic plans for the assessment and mitigation of space weather hazards [e.g., *Love et al.*, 2014].

## 2 The *D**s**t* Time Series

The standard *D**s**t* index is provided by the Kyoto World Data Center in Japan. It is derived from hourly mean, horizontal intensity geomagnetic time series collected at four observatories [*Sugiura and Kamei*, 1991]: Hermanus (South African National Space Agency), Kakioka (Japan Meteorological Agency), Honolulu (U.S. Geological Survey), and San Juan (U.S. Geological Survey). A geomagnetic disturbance time series for each observatory is estimated by subtracting a nonstorm, quiet time baseline. The *D**s**t* time series is an average of the individual disturbance time series from the four observatories. We use “definitive” *D**s**t* values derived from calibrated magnetic observatory data. The Kyoto *D**s**t* index is “quantized” to the nearest 1 nT. Note that we have chosen not to use a version of *D**s**t* corrected for magnetopause currents generated by solar wind pressure [e.g., *Burton et al.*, 1975] nor have we corrected for tail current [e.g., *Turner et al.*, 2000]. While such corrections, and others that might be proposed, are interesting and important, our focus, here, is more strictly phenomenological; we are concerned with total magnetic storm intensity as measured by *D**s**t* regardless of the detailed cause of storm intensity.

In Figure 1 we show the *D**s**t* time series for the years 1957–2012, a period of time that encompasses six complete solar cycles: from the rise phase of solar cycle 19 through to the rise phase of cycle 24. Although this duration does not stretch from one sunspot minimum to another minimum, all of the solar cycle phases (rise, maximum, decline, and minimum) are represented approximately equally; thus, we have little bias in our amalgamation of *D**s**t* statistics across solar cycles. For 1957–2012, there are 490,896 separate hourly *D**s**t* values, and there are no gaps or obviously erroneous values. We subsample the *D**s**t* time series for storm maxima −*D**s**t* values (these particular *D**s**t* values are negative): paying attention to the identification of both intense magnetic storms (which are rare) and weak magnetic storms (which are numerous), we use a simple algorithm that ranks and winnows the *D**s**t* data, counting all storm maxima −*D**s**t* values greater than or equal to 63 nT, a convenience corresponding to a bin boundary from among the 10 that cover a decade of range in −*D**s**t*. Storms with maxima less than this threshold can be difficult to distinguish from intense magnetic storms having multistep evolution or from occasional periods of general magnetic disturbance that might precede or follow a storm.
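The text does not spell out the rank-and-winnow algorithm, so the following Python sketch is only one plausible reading of it: hours are ranked by −Dst, and a candidate is kept as a storm maximum only if no stronger, already accepted maximum lies within a fixed time window. The function name `storm_maxima` and the 48 hour separation window are assumptions made for illustration; they are not values taken from the analysis.

```python
from typing import List, Sequence, Tuple


def storm_maxima(neg_dst: Sequence[float],
                 threshold: float = 63.0,
                 min_separation_hours: int = 48) -> List[Tuple[int, float]]:
    """Return (hour_index, -Dst) pairs for storm maxima at or above threshold.

    neg_dst holds hourly -Dst values (positive during storms).
    min_separation_hours is a hypothetical window, not from the paper.
    """
    # Rank: candidate hours sorted from most to least intense.
    candidates = sorted(
        (i for i, value in enumerate(neg_dst) if value >= threshold),
        key=lambda i: neg_dst[i],
        reverse=True,
    )
    accepted: List[int] = []
    # Winnow: drop any candidate that sits close to a stronger accepted maximum.
    for i in candidates:
        if all(abs(i - k) >= min_separation_hours for k in accepted):
            accepted.append(i)
    return sorted((i, neg_dst[i]) for i in accepted)


# Synthetic example: two storms, the first with a multi-hour peak.
series = [10.0] * 200
series[50], series[51], series[52] = 150.0, 140.0, 120.0
series[150], series[151] = 80.0, 70.0
maxima = storm_maxima(series)
```

With this reading, each storm contributes one maximum regardless of how many hours it spends above the threshold, which is the behavior described for the March 1989 and July 1959 storms.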

We note some specifics. Severe geomagnetic activity was realized over about one and a half days during the 13–14 March 1989 storm [e.g., *Allen et al.*, 1989], but in compiling storm maxima statistics, this storm is counted only once with a maximum −*D**s**t* value of 589 nT; similarly, the 15–20 July 1959 storm is counted only once with a maximum −*D**s**t* of 429 nT, etc. For the years 1957–2012, we identify 1051 separate storms having maximum −*D**s**t*≥63 nT; of these, 68 storms have a maximum −*D**s**t* value exceeding 200 nT, 21 exceed 300 nT, and only 5 exceed 400 nT. For convenience, we denote these storm maximum values as −*D**s**t*_{j}, with each value corresponding to the universal time hour *t*_{j} of a storm's maximum intensity. A complete list of the storm maximum values is given in the supporting information for this article.
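The binning convention mentioned above (10 logarithmically spaced bins per decade, starting at 63 nT) and the conversion of storm counts into densities in counts/yr/nT can be sketched as follows. Rounding the bin edges to whole nanotesla is an assumption, motivated by the 1 nT quantization of the index and by the bin labels quoted in the Results (63–79 nT, 251–316 nT, 501–631 nT); the function names are hypothetical.

```python
from typing import List, Sequence


def log_bin_edges(lo_exp: float = 1.8, hi_exp: float = 2.8,
                  bins_per_decade: int = 10) -> List[int]:
    """Bin edges at 10**(k / bins_per_decade), rounded to whole nT (assumption)."""
    n_bins = round((hi_exp - lo_exp) * bins_per_decade)
    return [round(10.0 ** (lo_exp + k / bins_per_decade))
            for k in range(n_bins + 1)]


def binned_densities(maxima: Sequence[float], edges: Sequence[float],
                     years: float = 56.0) -> List[float]:
    """Counts per bin divided by bin width (nT) and record duration (years)."""
    densities = []
    for left, right in zip(edges[:-1], edges[1:]):
        count = sum(1 for x in maxima if left <= x < right)
        densities.append(count / (right - left) / years)
    return densities


edges = log_bin_edges()
# Toy input only: two weak storms and the single 589 nT storm of March 1989.
toy_densities = binned_densities([63, 70, 589], edges)
```

Each density is a count divided by both a bin width and the 56 year record length, so bins of unequal width can be compared directly on a log-log plot.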

## 3 Lognormal Statistics

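A random variable *x* is normally distributed if its probability density function is

$$\mathcal{N}(x|\mu,\sigma^{2}) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right], \tag{1}$$

where *μ* and *σ*^{2} correspond to the *x* population mean and variance; *μ* and *σ*^{2} are model parameters and will be retained as such throughout this analysis. Physical processes that result from the additive superposition of multiple underlying effects are often recorded in statistical data that are approximately normally distributed. This relative ubiquity has foundation in one of the most important theorems of statistics: the central limit theorem, which, in its general form, states that the arithmetic average of *n* independent random variables, each drawn from an arbitrary distribution having a mean and variance, will be normally distributed in the limit as *n* approaches infinity [e.g., *Feller*, 1966, chapter VIII.4; *Stuart and Ord*, 1994, chapter 8.47].

By extension, a random variable *x* is lognormally distributed if the logarithm of *x* is normally distributed,

$$\mathcal{N}(\ln x|\mu,\sigma^{2}) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{(\ln x-\mu)^{2}}{2\sigma^{2}}\right], \tag{2}$$

where ln *x* denotes the natural logarithm of *x* and *μ* and *σ*^{2} are now the ln *x* population mean and variance, which can still be regarded as model parameters. A change of variables from ln *x* to *x* must conserve differential probability. In particular,

$$\mathcal{L}(x|\mu,\sigma^{2})\,dx = \mathcal{N}(\ln x|\mu,\sigma^{2})\,d\ln x, \tag{3}$$

so that, with *d* ln *x* = *dx*/*x*, the lognormal density function is

$$\mathcal{L}(x|\mu,\sigma^{2}) = \frac{1}{x\sigma\sqrt{2\pi}}\exp\left[-\frac{(\ln x-\mu)^{2}}{2\sigma^{2}}\right]. \tag{4}$$

The probability that a value drawn from a lognormal process will exceed *x* is given by the cumulative

$$F(x|\mu,\sigma^{2}) = \int_{x}^{\infty}\mathcal{L}(x'|\mu,\sigma^{2})\,dx' = \frac{1}{2}\,\mathrm{erfc}\left[\frac{\ln x-\mu}{\sigma\sqrt{2}}\right] \tag{5}$$

[e.g., *Gradshteyn and Ryzhik*, 1980, equation 3.321.2; *Crow and Shimizu*, 1988, p. 114], where erfc is the complementary error function. It is important to recognize that, as for the normal distribution, a central limit theorem applies for the lognormal distribution: the multiplication of *n* independent random variables, each of which is positive and drawn from an arbitrary distribution having a mean and variance, will be lognormally distributed in the limit as *n* approaches infinity [e.g., *Aitchison and Brown*, 1957, chapter 2.6; *Crow and Shimizu*, 1988, p. 5; *Ross*, 2014, p. 244].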
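As a numerical cross-check on the lognormal density and its exceedance cumulative, the following sketch evaluates both and compares a direct integration of the density over the tail with the closed-form erfc expression. The function names and the test values of *μ* and *σ* are illustrative assumptions.

```python
import math


def lognormal_pdf(x: float, mu: float, sigma: float) -> float:
    """Lognormal density: the normal density in ln x divided by x."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def lognormal_exceedance(x: float, mu: float, sigma: float) -> float:
    """Probability that a lognormal variate exceeds x, via erfc."""
    return 0.5 * math.erfc((math.log(x) - mu) / (sigma * math.sqrt(2.0)))


def tail_integral(x: float, mu: float, sigma: float, steps: int = 20000) -> float:
    """Trapezoid-rule integral of the density from x outward.

    The integration variable is u = ln x', so dx' = exp(u) du; the upper
    limit mu + 12*sigma stands in for infinity (valid while ln x is below it).
    """
    a, b = math.log(x), mu + 12.0 * sigma
    h = (b - a) / steps
    total = 0.0
    for k in range(steps + 1):
        u = a + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * lognormal_pdf(math.exp(u), mu, sigma) * math.exp(u)
    return total * h


closed_form = lognormal_exceedance(2.0, 0.5, 1.0)
numerical = tail_integral(2.0, 0.5, 1.0)
```

The two numbers should agree closely, and the exceedance at *x* = exp(*μ*), the median of the distribution, should equal one half.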

We examine the hypothesis that the −*D**s**t*_{j} data can be treated as a realization of a lognormal stochastic process [e.g., *Pulkkinen et al.*, 2008, paragraph 29]. Roughly speaking, multiplicative processes operating in three different domains affect the statistics of magnetic storms. First, the solar cycle results from autoregressive dynamo action, an ∼11 year quasiperiodic amplification and deamplification of poloidal solar magnetic energy [e.g., *Solanki et al.*, 2000] that is manifest as a modulation of sunspot number, coronal mass ejections, and geomagnetic storm occurrence rate. As an aside, we note that statistical models for periodically modulated phenomena can be parameterized in terms of an integrated average rate of occurrence [e.g., *Cox and Lewis*, 1966, chapter 2.2.v]. Second, the geoeffectiveness of solar wind-magnetosphere coupling is often calculated from multiplicative combinations of solar wind velocity, density, and interplanetary magnetic field direction and intensity [e.g., *Newell et al.*, 2007]. While these variables are not, themselves, statistically independent, satellite data acquired over multiple solar cycles show that solar wind density and dynamic pressure are approximately lognormally distributed [e.g., *Veselovsky et al.*, 2010]. Third, the dynamic evolution of a magnetic storm can be modeled as an autoregressive filter of solar wind forcing and an integration of previous recent states of the magnetospheric-ionospheric system [e.g., *Vassiliadis et al.*, 1999]. Considering, then, all these physical processes together, the multiplicative central limit theorem could be of relevance. We emphasize, however, that optimism should be tempered with recognition that the three physical domains do not act independently to determine magnetic storm intensity. In the end, any practical assessment of the success or failure of the lognormal hypothesis will be largely dependent on its consistency with the −*D**s**t*_{j} data.
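The multiplicative central limit theorem invoked here can be illustrated with a small simulation: the logarithm of a product of many independent positive factors becomes close to symmetric, as a normal variable would be, even when the logarithm of a single factor is visibly skewed. The uniform factor distribution on [0.5, 1.5], the sample sizes, and the use of skewness as a diagnostic are all assumptions chosen for illustration; none of them represents solar wind or magnetospheric physics.

```python
import math
import random
import statistics
from typing import List


def log_product_samples(n_factors: int, n_samples: int, seed: int = 0) -> List[float]:
    """Log of the product of n_factors independent uniform(0.5, 1.5) factors."""
    rng = random.Random(seed)
    return [
        sum(math.log(rng.uniform(0.5, 1.5)) for _ in range(n_factors))
        for _ in range(n_samples)
    ]


def skewness(values: List[float]) -> float:
    """Sample skewness; zero for a symmetric (for example, normal) distribution."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)


skew_single = skewness(log_product_samples(1, 5000))
skew_many = skewness(log_product_samples(50, 5000))
```

The skewness of the log-product shrinks as the number of factors grows, which is the sense in which the product itself tends toward a lognormal distribution. The caveat in the text still applies: the theorem assumes independent factors, which the three physical domains are not.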

## 4 Fitting Models to Data

To model the −*D**s**t*_{j} occurrence rate, we use a truncated lognormal function *m* that is proportional to the density function 4,

$$m(-Dst|A,\mu,\sigma^{2}) = A\,\mathcal{L}(-Dst|\mu,\sigma^{2}) \quad \text{for } -Dst \geq 63\ \mathrm{nT}, \tag{6}$$

where $\mathcal{L}$ is the lognormal density function and *A* is a normalizing amplitude. We denote by *N* the number of storms such that −*D**s**t*_{j} ≥ 63 nT. We estimate model parameters {*A*, *μ*, *σ*^{2}} by fitting equation 6 to the −*D**s**t*_{j} data using two different standard methods. The two resulting fits have slightly different properties, which we compare and contrast.

First, we fit the data using a maximum likelihood (ML) method [e.g., *Coles*, 2001, chapter 2.6.3]. Assuming that the data are samples from a lognormal distribution, we define an objective reward function from their joint probability,

$$L = \prod_{j=1}^{N}\frac{m(-Dst_{j}|A,\mu,\sigma^{2})}{\int_{63}^{\infty} m(x|A,\mu,\sigma^{2})\,dx},$$

where the product in *j* is over all the −*D**s**t*_{j} data values; model parameters are obtained by maximizing *L*. Second, we fit the data using a weighted least squares (WLS) method. We count the number of −*D**s**t*_{j} data in each of a discrete set of bin intervals having widths measured in nanotesla; we divide the counts in each by the bin width and by the 56 year duration of the data time series; this gives us binned densities *b*_{i} measured in counts/yr/nT. As an objective loss function, we use the *χ*^{2} sum of the squares of “relative errors” defined by the residual discrepancy between each bin and model density, *b*_{i} − *m*(−*D**s**t*_{i}), each weighted by dividing by the bin density *b*_{i},

$$\chi^{2} = \sum_{i}\left[\frac{b_{i} - m(-Dst_{i}|A,\mu,\sigma^{2})}{b_{i}}\right]^{2} \tag{9}$$

[e.g., *Castillo*, 1988, chapter 4.6.2; *Hansen et al.*, 2013, chapter 1.3], where summation in *i* is over the bins *b*_{i}; model parameters are obtained by minimizing *χ*^{2}. We remark that an unweighted least squares method would give a fit that is similar to that which can be obtained with the maximum likelihood method, assuming normally distributed residuals [e.g., *Riley*, 2012; *Hansen et al.*, 2013, chapter 1.3].
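The two objective functions described in this section can be sketched in Python as follows. This is a hedged illustration, not the authors' code: the likelihood is written for a lognormal density truncated below at the 63 nT threshold (so the amplitude cancels), the choice of a representative intensity for each bin is left to the caller, and empty bins are skipped because their relative error is undefined. All function names are hypothetical, and any optimizer could be used to maximize or minimize these functions.

```python
import math
from typing import Sequence


def log_lognormal_pdf(x: float, mu: float, sigma: float) -> float:
    """Natural log of the lognormal density at x > 0."""
    z = (math.log(x) - mu) / sigma
    return -0.5 * z * z - math.log(x * sigma * math.sqrt(2.0 * math.pi))


def exceedance(x: float, mu: float, sigma: float) -> float:
    """Probability that a lognormal variate exceeds x."""
    return 0.5 * math.erfc((math.log(x) - mu) / (sigma * math.sqrt(2.0)))


def neg_log_likelihood(data: Sequence[float], mu: float, sigma: float,
                       threshold: float = 63.0) -> float:
    """Negative log joint probability for data truncated below at threshold.

    Minimizing this quantity is equivalent to maximizing the likelihood.
    """
    log_norm = math.log(exceedance(threshold, mu, sigma))
    return -sum(log_lognormal_pdf(x, mu, sigma) - log_norm for x in data)


def weighted_chi2(bin_densities: Sequence[float], bin_centers: Sequence[float],
                  amp: float, mu: float, sigma: float) -> float:
    """Sum of squared relative errors between binned and model densities."""
    total = 0.0
    for b, x in zip(bin_densities, bin_centers):
        if b > 0.0:  # assumption: empty bins are skipped
            model = amp * math.exp(log_lognormal_pdf(x, mu, sigma))
            total += ((b - model) / b) ** 2
    return total
```

Because the likelihood sums over individual storms, it is dominated by the numerous weak events, whereas the relative-error weighting gives every occupied bin a comparable say, including the sparsely populated bins in the tail.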

We use a bootstrap method [e.g., *Efron and Tibshirani*, 1993] to estimate the errors of extrapolated storm occurrence rates and intensities. For this, the (measured, “mother”) −*D**s**t*_{j} data are treated as a population distribution from which samples are drawn. We randomly sample with replacement from the −*D**s**t*_{j} data, assembling numerous empirical data sets, each having size equal to that of the mother data set. We assume, as is standard, that each empirical data set is a plausible sample of magnetic storm intensities that could have been realized in the past and which might be realized in the future. We fit each empirical data set with a separate lognormal function using either of the described maximum likelihood or weighted least squares methods. From the set of bootstrap empirical fits, then, we estimate confidence intervals on predicted occurrence rates and intensities.
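The bootstrap procedure can be sketched generically: resample the data with replacement, recompute a statistic on each resample, and read the confidence interval from the percentiles of the resulting estimates. In the sketch below the statistic is a simple empirical exceedance rate, used as a stand-in for the lognormal refit described in the text; the function names, the number of resamples, and the synthetic data are assumptions.

```python
import random
from typing import Callable, List, Sequence, Tuple


def bootstrap_interval(data: Sequence[float],
                       statistic: Callable[[List[float]], float],
                       n_boot: int = 1000,
                       level: float = 0.95,
                       seed: int = 0) -> Tuple[float, float, float]:
    """Return (lower, median, upper) percentile estimates of a statistic."""
    rng = random.Random(seed)
    n = len(data)
    # Each resample has the same size as the original ("mother") data set.
    estimates = sorted(
        statistic(rng.choices(data, k=n)) for _ in range(n_boot)
    )
    lo_idx = int((1.0 - level) / 2.0 * (n_boot - 1))
    hi_idx = int((1.0 + level) / 2.0 * (n_boot - 1))
    return estimates[lo_idx], estimates[(n_boot - 1) // 2], estimates[hi_idx]


def rate_per_century_above_300(sample: List[float], years: float = 56.0) -> float:
    """Empirical rate of storm maxima at or above 300 nT (illustrative statistic)."""
    return 100.0 * sum(1 for x in sample if x >= 300.0) / years


# Synthetic storm maxima, for demonstration only.
synthetic = [63.0 + 5.0 * k for k in range(100)]
lower, median, upper = bootstrap_interval(synthetic, rate_per_century_above_300,
                                          n_boot=500, seed=1)
```

To follow the text more closely, `statistic` would refit the lognormal model to each resample and return the extrapolated rate or intensity of interest.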

## 5 Results

In Figure 2a we show histograms (black) of binned densities for storm maxima −*D**s**t*_{j} ≥ 63 nT values from years 1957–2012, and in Figure 2b we show the corresponding exceedance cumulatives (black). We also show lognormal model functions fitted by maximum likelihood (blue) and weighted least squares (red) methods; model parameters are listed in Table 1. Before we continue, we remark that these model parameters can be used to make extrapolated estimates of storm occurrence rates and intensities, but the accuracy of such extrapolations is poor for storms that are far out in the extreme-event tail of the distribution, having intensities substantially greater than the measured maximum of −*D**s**t*_{j} = 589 nT; in section 7 we report confidence intervals. Similarly, since the expected value *E*(*x*) is far below the cutoff of 63 nT, the parameters *μ* and *σ* are sensitive functions of the estimation method, and detailed interpretation of their values should be made with caution.

Table 1. Lognormal Parameters Fitted to the −*D**s**t*_{j} Data, Together With Kolmogorov-Smirnov and Weighted *χ*^{2} Significance *p* Values^{a}

| Method | *A* (number/yr) | *E* (nT) | *S* (nT) | *p*(KS) | *p*(*χ*^{2}) |
|---|---|---|---|---|---|
| ML | 3168.58 | 5.42 | 11.79 | 0.0201 | 0.3504 |
| WLS | 82.99 | 45.64 | 44.49 | 0.0000 | 0.9999 |

^{a}To convert to the parameterization used in this analysis, $\mu = \ln\left(E^{2}/\sqrt{E^{2}+S^{2}}\right)$ and $\sigma^{2} = \ln\left(1+S^{2}/E^{2}\right)$.
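A short sketch of the conversion from the tabulated values to (*μ*, *σ*^{2}) follows. It assumes that *E* and *S* in Table 1 are the mean and standard deviation of the lognormal variable, in which case the standard moment relations are *σ*^{2} = ln(1 + *S*^{2}/*E*^{2}) and *μ* = ln *E* − *σ*^{2}/2. As a consistency check on that reading, the model exceedance rate at the 63 nT threshold is compared with the observed rate of 1051 storms in 56 years. The function names are hypothetical.

```python
import math
from typing import Tuple


def moments_to_lognormal(mean: float, std: float) -> Tuple[float, float]:
    """Convert lognormal mean and standard deviation to (mu, sigma^2)."""
    sigma2 = math.log(1.0 + (std / mean) ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, sigma2


def exceedance(x: float, mu: float, sigma2: float) -> float:
    """Probability that a lognormal variate exceeds x."""
    return 0.5 * math.erfc((math.log(x) - mu) / math.sqrt(2.0 * sigma2))


# Maximum likelihood row of Table 1: A = 3168.58 /yr, E = 5.42 nT, S = 11.79 nT.
mu_ml, sigma2_ml = moments_to_lognormal(5.42, 11.79)
model_rate_63 = 3168.58 * exceedance(63.0, mu_ml, sigma2_ml)  # storms per year
observed_rate_63 = 1051 / 56.0                                # storms per year
```

Under this assumption the maximum likelihood amplitude reproduces the observed storm rate above the threshold closely, which supports reading *A* as a rate normalization; the weighted least squares amplitude is a free fit parameter and need not match as well.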

Judged visually, fitted lognormal model functions seem to provide reasonably good representations of the binned −*D**s**t*_{j} data. In more detail, however, the maximum likelihood fit is very good for the lowest energy bin, 63–79 nT, while the least squares fit is not so good for that bin. On the other hand, the maximum likelihood fit is not so good for the highest energy bin, 501–631 nT, for which there is only one count, while the least squares fit is very good for that bin. The *χ*^{2} measure of relative discrepancy [*Press et al.*, 1992, chapter 14.3] between the maximum likelihood fit and the binned data, equation 9, would be a moderately likely statistical realization of random data, the significance probability *p* = 0.3504; the least squares fit would be a very likely statistical realization of random data, *p* = 0.9999 (Table 1). On this basis, we cannot confidently reject the null hypothesis that the data are realized from a lognormal process. Similar observations pertain for comparisons of the model fits with the data cumulatives. The Kolmogorov-Smirnov *D* measure [*Press et al.*, 1992, chapter 14.3] of the greatest absolute discrepancy between the maximum likelihood fit and the data cumulatives would not be an especially likely statistical realization of random data, the significance probability *p* = 0.0201, but it would be very unlikely for the least squares fit, *p* < 0.0001 (Table 1). In contrast to the evaluation of the binned data, in judging the cumulatives we might be tempted to reject the null hypothesis that the data are realized from a lognormal process.

It is worth emphasizing that the preceding formal and somewhat inconsistent assessments of significance should be regarded with a practical awareness of the considerable variability in −*D**s**t*_{j} that accompanies the solar cycle waxing and waning of sunspots and the variable geoeffectiveness of the solar wind. In Figures 2a and 2b we show, respectively, the data densities and cumulatives for each of the six individual solar cycles (gray). Among these, the rise of cycle 19 to the rise of cycle 20 (1957–1966) corresponds to a high number of sunspots (solar maximum: 201.3), while the rise of cycle 20 to the rise of cycle 21 (1967–1977) corresponds to a low number of sunspots (solar maximum: 110.6); the present solar cycle (24) is not complete at the time of this writing. The solar cycle subsets show substantial dispersion about the densities and cumulatives that are based on all the data (black). Indeed, some of the misfit between the data cumulative and the lognormal model fits, for example, for bin 251–316 nT, might be attributed to magnetic storms realized during a single solar cycle. More generally, given the dispersion seen in the data for individual solar cycles, the lognormal fits obtained by either maximum likelihood (blue) or least squares (red) methods might be considered rather satisfactory representations of −*D**s**t*_{j}.

## 6 Power-Law Statistics

A power-law distribution has a density function of the form

$$P(x|\alpha) = C\,x^{-\alpha},$$

where *α*, the scaling exponent, is sometimes called the “fractal dimension,” and *C* is a normalization constant that is defined for *x* larger than a threshold [e.g., *Newman*, 2005]. Power-law distributions have been shown to describe the occurrence statistics of many types of natural events, including earthquakes, landslides, floods, volcanic eruptions, and solar flares [e.g., *Crosby*, 2011; *Sachs et al.*, 2012]. Several attempts have been made to model storm time geomagnetic disturbance data with power-law distributions [e.g., *Love and Gannon*, 2009; *Riley*, 2012; *Kataoka*, 2013; *Yermolaev et al.*, 2013]. In terms of the −*D**s**t*_{j} disturbance data shown in the log-log plot of Figure 2, a power-law function appears as a straight line (green for both density and cumulative), yet we see in Figure 2 that the occurrence rate of the −*D**s**t*_{j} data does not follow a straight line; for both the *χ*^{2} and Kolmogorov-Smirnov *D* tests, power-law fits for −*D**s**t*_{j} ≥ 63 nT have very low significance probabilities *p*. Indeed, a power law can only give a reasonable fit for a small range of −*D**s**t*_{j} values; on the other hand, the lognormal distribution is a superior representation over a wide range of −*D**s**t*_{j} values. While there is certainly room for improvement upon the lognormal functional fit to the −*D**s**t*_{j} data, we can confidently favor the lognormal function over the power-law function.

Nonetheless, we might still be curious as to why the −*D**s**t*_{j} data are not well represented by a power-law function. Some physical systems that exhibit power-law statistics can be described in terms of self-organizing criticality (SOC) [e.g., *Turcotte*, 1999; *Aschwanden*, 2011]. In particular, the detailed time evolution of individual magnetic substorms caused by collapses of the magnetotail has sometimes been described in terms of SOC dynamics [*Angelopoulos et al.*, 1999; *Chang*, 1999]. Both numerical simulations of substorm dynamics [*Klimas et al.*, 2000] and ground-level geomagnetic data [e.g., *Wanliss*, 2005; *Pulkkinen et al.*, 2006; *Balasis et al.*, 2009] exhibit “multifractal” power-law statistics, with *α* being a function of solar wind forcing. If we accept this, then in analyzing −*D**s**t*_{j} data recording many years of geomagnetic activity, we are mixing data drawn from a quiet time statistical distribution with data drawn from a broad range of storm time power-law distributions. It is not evident that such an amalgamation could give a power-law distribution. And, indeed, the −*D**s**t*_{j} data are evidently not consistent with a power-law process.

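Power-law functions are scale invariant: a rescaling of the variable by a factor *r*, $x \rightarrow rx$, gives a scaling of the function, $P(rx|\alpha) = r^{-\alpha}P(x|\alpha)$, including in the limit $x \rightarrow \infty$. In contrast, the corresponding extreme-value limit [e.g., *Beirlant et al.*, 2004, chapter 2.9.2] for the lognormal distribution is not scale-invariant. Writing the lognormal density function of equation 4 as $\mathcal{L}(x|\mu,\sigma^{2})$,

$$\lim_{x\rightarrow\infty}\frac{\mathcal{L}(rx|\mu,\sigma^{2})}{\mathcal{L}(x|\mu,\sigma^{2})} = \lim_{x\rightarrow\infty}\frac{1}{r}\exp\left[-\frac{\ln r\,(2\ln x+\ln r-2\mu)}{2\sigma^{2}}\right] = 0 \quad \text{for } r > 1.$$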

The tail of a lognormal distribution never resembles a power-law [e.g., *Malevergne et al.*, 2011, equation 3]. Therefore, an extrapolation of a lognormal distribution for extreme-event probabilities will give results that are different from extrapolation of a power-law distribution [e.g., *Clauset et al.*, 2009, Figure 5].
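The contrast between the two tails can be checked numerically. For a power law, the ratio of the function at *rx* to the function at *x* is the same at every *x*; for a lognormal density, the same ratio keeps shrinking as *x* grows. The exponent *α* = 3 and the lognormal parameter values below are arbitrary choices for illustration, and the function names are hypothetical.

```python
import math
from typing import Callable


def power_law(x: float, alpha: float, c: float = 1.0) -> float:
    """Power-law function c * x**(-alpha)."""
    return c * x ** (-alpha)


def lognormal_pdf(x: float, mu: float, sigma: float) -> float:
    """Lognormal density function."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def scaling_ratio(f: Callable[[float], float], x: float, r: float) -> float:
    """Ratio f(r * x) / f(x); constant in x only for a scale-invariant f."""
    return f(r * x) / f(x)


def pl(x: float) -> float:
    return power_law(x, alpha=3.0)


def ln_density(x: float) -> float:
    return lognormal_pdf(x, mu=0.8, sigma=1.3)


power_ratios = [scaling_ratio(pl, x, 2.0) for x in (1e2, 1e3, 1e4)]
lognormal_ratios = [scaling_ratio(ln_density, x, 2.0) for x in (1e2, 1e3, 1e4)]
```

The power-law ratios all equal 2^{−3} = 0.125, while the lognormal ratios decrease monotonically, so a straight line fitted to one part of a lognormal tail on a log-log plot overestimates the rates further out.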

## 7 Extreme-Event Extrapolations

From fits of lognormal functions to bootstrap test samplings of the −*D**s**t*_{j} data, we estimate median storm occurrence rates and confidence intervals; results are summarized in Table 2. For −*D**s**t*_{j} ≥ 200 nT exceedance thresholds, maximum likelihood results have higher occurrence rates and tighter confidence intervals than corresponding least squares results. More specifically, storms with intensities exceeding that of the 13–14 March 1989 storm, −*D**s**t*_{j} ≥ 589 nT, have a median maximum likelihood rate of 4.03 times per century and a 95% confidence interval of [2.01, 6.84] times per century. Corresponding least squares results indicate that storms with intensities exceeding that of the 13–14 March 1989 storm occur with a median rate of 1.86 times per century, but with a (very wide) 95% confidence interval of [0.33, 8.18] times per century. These results can be compared with those of *Love* [2012, equations (20) and (21)], who applied Bayesian analysis to a Poisson model to obtain an estimated occurrence rate and corresponding “credibility” intervals from counts of random events realized over a finite duration of time. Applying his formulas to a count of just one event in 56 years of time, like that for the largest −*D**s**t*_{j} datum corresponding to the 1989 storm, we obtain an estimated occurrence rate of 1.79 times per century with an (approximate) 95% confidence interval of [0.00, 7.14] times per century. We note that this interval encompasses both the maximum likelihood rate estimate, 4.03 times per century, and the least squares rate estimate, 1.86 times per century.

Table 2. Median Occurrence Rates and 95% Confidence Intervals for Storms Exceeding a Given −*D**s**t*_{j}^{a}

| Method | Example Storm | −*D**s**t*_{j} (nT) | Median Rate (/100 years) | 95% Confidence Interval (/100 years) |
|---|---|---|---|---|
| ML | | 100 | 658.09 | [611.85, 706.15] |
| WLS | | 100 | 675.51 | [603.72, 741.25] |
| ML | | 200 | 109.85 | [91.78, 129.53] |
| WLS | | 200 | 108.03 | [75.61, 142.62] |
| ML | | 300 | 34.27 | [25.03, 44.41] |
| WLS | | 300 | 28.39 | [13.73, 51.35] |
| ML | 29–31 Oct 2003 | 383 | 16.26 | [10.56, 22.82] |
| WLS | 29–31 Oct 2003 | 383 | 11.09 | [3.94, 26.94] |
| ML | | 400 | 14.18 | [8.98, 20.24] |
| WLS | | 400 | 9.30 | [3.11, 23.97] |
| ML | | 500 | 6.95 | [3.85, 10.88] |
| WLS | | 500 | 3.71 | [0.89, 12.99] |
| ML | 13–14 Mar 1989 | 589 | 4.03 | [2.01, 6.84] |
| WLS | 13–14 Mar 1989 | 589 | 1.86 | [0.33, 8.18] |
| ML | | 600 | 3.79 | [1.87, 6.50] |
| WLS | | 600 | 1.72 | [0.29, 7.76] |
| ML | 1–2 Sept 1859 | 850 | 1.13 | [0.42, 2.41] |
| WLS | 1–2 Sept 1859 | 850 | 0.36 | [0.03, 2.79] |

^{a}Values for both maximum likelihood (ML) and weighted least squares (WLS) are listed.

Extrapolated estimates of occurrence rates for even more intense magnetic storms are accompanied by even greater uncertainty. For storms having intensities exceeding that of the 1–2 September 1859 Carrington superstorm [*Tsurutani et al.*, 2003; *Siscoe et al.*, 2006; *Cliver and Dietrich*, 2013], −*D**s**t*_{j} ≥ 850 nT, maximum likelihood results have a median occurrence rate of 1.13 times per century and a 95% confidence interval of [0.42,2.41] times per century. Corresponding least squares results indicate a median occurrence rate of 0.36 times per century with a 95% confidence interval of [0.03,2.79] times per century. An extrapolation from a power-law model of the −*D**s**t*_{j} data gives an occurrence rate for a storm with an intensity exceeding the Carrington event at about 1.20 times per century [e.g., *Riley*, 2012]; this is just slightly higher than the median rates we estimate, and well within our estimated 95% confidence intervals.
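The extrapolated occurrence rates can be approximately reproduced from the full-data fit parameters of Table 1, as sketched below. The sketch assumes that the exceedance rate is the amplitude *A* times the lognormal exceedance probability, and that *E* and *S* in Table 1 are the lognormal mean and standard deviation. The results are point estimates from a single fit, so they need not coincide exactly with the bootstrap medians of Table 2. The function names are hypothetical.

```python
import math


def lognormal_params(mean: float, std: float):
    """Convert lognormal mean and standard deviation to (mu, sigma)."""
    sigma2 = math.log(1.0 + (std / mean) ** 2)
    return math.log(mean) - 0.5 * sigma2, math.sqrt(sigma2)


def rate_per_century(x: float, amp: float, mean: float, std: float) -> float:
    """Expected number of storms per century with maximum -Dst at or above x.

    amp is the Table 1 amplitude in storms per year.
    """
    mu, sigma = lognormal_params(mean, std)
    exceed = 0.5 * math.erfc((math.log(x) - mu) / (sigma * math.sqrt(2.0)))
    return 100.0 * amp * exceed


# Table 1 rows: (A, E, S) for maximum likelihood and weighted least squares.
ML = (3168.58, 5.42, 11.79)
WLS = (82.99, 45.64, 44.49)

ml_rate_1989 = rate_per_century(589.0, *ML)        # threshold of the 1989 storm
ml_rate_carrington = rate_per_century(850.0, *ML)  # Carrington threshold
wls_rate_carrington = rate_per_century(850.0, *WLS)
```

Comparing these values with the 589 nT and 850 nT rows of Table 2 gives a quick check that the tabulated parameters and the extrapolated rates are mutually consistent.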

From fits of lognormal functions to bootstrap test samplings of the −*D**s**t*_{j} data, we can also estimate magnetic storm exceedance intensity given a prescribed occurrence rate; results are summarized in Table 3. For once-per-decade, or “10 year,” events, maximum likelihood results indicate a 10 year median of −*D**s**t*_{j} ≥ 447 nT and a 95% confidence interval of [389, 515] nT. Corresponding least squares results indicate a 10 year median of −*D**s**t*_{j} ≥ 393 nT with a 95% confidence interval of [320, 549] nT. Once-per-century or “100 year” exceedance probabilities [e.g., *Koons*, 2001; *Thomson et al.*, 2011; *Pulkkinen et al.*, 2012] are often used to define space weather hazard mitigation standards [e.g., *Fennell et al.*, 2001; *Samuelsson*, 2013; *North American Electric Reliability Corporation*, 2014]. Maximum likelihood bootstrap results indicate a 100 year median of −*D**s**t*_{j} ≥ 880 nT with a 95% confidence interval of [697, 1146] nT. Corresponding least squares results indicate a 100 year median exceedance of −*D**s**t*_{j} ≥ 680 nT with a 95% confidence interval of [490, 1187] nT.

Table 3. Median Storm Intensities, −*D**s**t*_{j}, and 95% Confidence Intervals Given Different Storm Occurrence Rates^{a}

| Method | Once per (years) | −*D**s**t*_{j} (nT) | 95% Confidence Interval (nT) |
|---|---|---|---|
| ML | 10 | 447 | [389, 515] |
| WLS | 10 | 393 | [320, 549] |
| ML | 100 | 880 | [697, 1146] |
| WLS | 100 | 680 | [490, 1187] |

^{a}Values for both maximum likelihood (ML) and weighted least squares (WLS) are listed.
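The intensity associated with a prescribed recurrence interval follows from inverting the exceedance rate. Under the assumption that the rate is *A* times the lognormal exceedance probability, a once-per-*T*-years storm satisfies *A*·*F*(*x*) = 1/*T*, so *x* = exp(*μ* + *σz*), where *z* is the standard normal quantile with upper-tail probability 1/(*AT*). The sketch below applies this to the maximum likelihood row of Table 1, again treating *E* and *S* as the lognormal mean and standard deviation; the function name is hypothetical, and the values are single-fit point estimates rather than bootstrap medians.

```python
import math
from statistics import NormalDist


def return_level(years: float, amp: float, mean: float, std: float) -> float:
    """-Dst intensity (nT) exceeded on average once per `years` years.

    amp is the Table 1 amplitude in storms per year; mean and std are the
    lognormal mean and standard deviation in nT.
    """
    sigma2 = math.log(1.0 + (std / mean) ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    tail_probability = 1.0 / (years * amp)
    # Upper-tail quantile of the standard normal distribution.
    z = -NormalDist().inv_cdf(tail_probability)
    return math.exp(mu + math.sqrt(sigma2) * z)


ml_10_year = return_level(10.0, 3168.58, 5.42, 11.79)
ml_100_year = return_level(100.0, 3168.58, 5.42, 11.79)
```

These point estimates can be compared with the maximum likelihood medians in Table 3; the width of the bootstrap intervals there is a reminder of how sensitive such inversions are to the fitted parameters.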

## 8 Accuracy Versus Stability

In comparing the maximum likelihood and least squares lognormal fits of the −*D**s**t*_{j} data, there is, evidently, a trade-off between the accuracy with which fits can be made to historical extreme-event data and the stability of the forecasts that might be made of future extreme events. The maximum likelihood method we use accurately fits most of the data, the preponderance of which record magnetic storms of weak intensity. The weighted least squares method gives an accurate fit of the extreme-event tail of the data distribution. Therefore, as a representation of the statistics of (past) extreme-event magnetic storms, the least squares fit might be preferred. On the other hand, the maximum likelihood method gives confidence intervals for estimated future storm occurrence rates and intensities that are tighter than those given by the least squares method. This difference can be attributed to the different weightings of the data made by the two different fitting methods. The maximum likelihood method fits all the data with equal weight, and this tends to stabilize the model estimates, while the weighted least squares method is required to fit bins of the data with weight relative to size, and this makes the fits sensitive to rare extreme-event statistics. From these modest observations we understand that to confidently forecast the future occurrence of extremely intense magnetic storms we need a model that can fit data recording past magnetic storms across a wide range of intensities. Still, with limited quantities of historical data, we predict that it will be a long time before we can substantially reduce the uncertainty of long-term forecasts of extremely intense magnetic storms.

## Acknowledgments

We thank C.A. Finn, J. McCarthy, M.P. Moschetti, and J.L. Slate for reviewing a draft manuscript. We thank M.A. Balikhin and A. Kelbert for useful conversations. This work was supported by the USGS Geomagnetism Program. The standard *D**s**t* index is provided by the Kyoto World Data Center in Japan (wdc.kugi.kyoto-u.ac.jp).

The Editor thanks two anonymous reviewers for their assistance in evaluating this paper.