Volume 17, Issue 1 p. 180-209
Research Article

The Development of a Space Climatology: 3. Models of the Evolution of Distributions of Space Weather Variables With Timescale

Mike Lockwood

Corresponding Author

Department of Meteorology, University of Reading, Reading, UK

Correspondence to: M. Lockwood

Sarah N. Bentley

Department of Meteorology, University of Reading, Reading, UK

Mathew J. Owens

Department of Meteorology, University of Reading, Reading, UK

Luke A. Barnard

Department of Meteorology, University of Reading, Reading, UK

Chris J. Scott

Department of Meteorology, University of Reading, Reading, UK

Clare E. Watt

Department of Meteorology, University of Reading, Reading, UK

Oliver Allanson

Department of Meteorology, University of Reading, Reading, UK

Mervyn P. Freeman

British Antarctic Survey, Cambridge, UK

First published: 07 December 2018
This article is a companion to Lockwood et al. (2018), https://doi.org/10.1029/2018SW001856 and Lockwood et al. (2018), https://doi.org/10.1029/2018SW002016.

Abstract

We study how the probability distribution functions of power input to the magnetosphere Pα and of the geomagnetic ap and Dst indices vary with averaging timescale, τ, between 3 hr and 1 year. From this we develop and present algorithms to empirically model the distributions for a given τ and a given annual mean value. We show that lognormal distributions work well for ap, but because of the spread of Dst for low activity conditions, the optimum formulation for Dst leads to distributions better described by something like the Weibull formulation. Annual means can be estimated using telescope observations of sunspots and modeling, and so this allows the distributions to be estimated at any given τ between 3 hr and 1 year for any of the past 400 years, which is another important step toward a useful space weather climatology. The algorithms apply to the core of the distributions and can be used to predict the occurrence rate of large events (in the top 5% of activity levels): they may contain some, albeit limited, information relevant to characterizing the much rarer superstorm events with extreme value statistics. The algorithm for the Dst index is the more complex one because, unlike ap, Dst can take on either sign and future improvements to it are suggested.

Key Points

  • Core distributions and extreme events of geomagnetic activity are studied as a function of averaging timescale τ
  • The autocorrelation is shown to have a dominant role determining how these core distributions vary with averaging timescale τ
  • Models for computing the distribution of geomagnetic activity for a given timescale τ and annual mean are presented

Plain Language Summary

This is the third in a series of three papers aimed at developing a climatology of space weather that applies to all solar conditions between grand solar minimum and grand solar maximum. We generate empirical models to enable us to predict the probability of a given level of space weather disturbance, as quantified by either the ap or the Dst geomagnetic index, in a year with a given average level of disturbance. The models can be used with averaging/integration times anywhere between 3 hr and 1 year.

1 Introduction

This paper is the third of a series of three that is aimed at putting in place some of the key elements that will be needed to build a space weather climatology that covers both grand solar maximum and grand solar minimum conditions. As discussed in the introductions to Papers 1 and 2 (Lockwood et al., 2018a, 2018b), information on space climate over an interval long enough to cover both a grand solar minimum and a grand solar maximum (of order 400 years) is available only in the form of modeled annual means of some key variables (Owens et al., 2017). Hence, developing a climatology giving the probability of space weather events of a given geoeffectiveness that covers both these extremes of the long-term solar variation requires us to develop an understanding of relationships between these annual means and the distributions of event amplitudes, quantified over the relevant timescales. Because space weather events come in bursts, the integrated value of any activity index X over the most relevant timescale τ, IX, is a useful metric (Borovsky, 2017; Echer et al., 2008; Lockwood et al., 2016; Tindale et al., 2018), and this equals the arithmetic mean value times τ (i.e., IX = ∫τXdt = τ <X>τ). Hence, it is important to study how <X>τ varies with τ and how it relates to the annual arithmetic mean value <X>τ=1year. Lockwood, Owens, et al. (2018) have demonstrated how annual means can be used to quantify the frequency of geomagnetic disturbance events above a given (large but not extreme) threshold for the past 400 years, but they studied only hourly and daily means (τ = 1 hr and τ = 1 day) which, in general, will not be the most relevant timescales for all space weather phenomena. For example, Lockwood et al. (2016) recently studied the interplanetary conditions leading to large geomagnetic storms as detected in the Dst index and found that τ ≈ 6 hr (with a 2σ uncertainty range of 4–12 hr) was optimum for predicting the maximum of the storm (i.e., the minimum Dst), but τ ≈ 4.5 days was needed to best predict the integrated Dst over the duration of the storm. Paper 1 (Lockwood et al., 2018a) studied energy coupling from the solar wind into the magnetosphere and showed that neglecting the effects of gaps in interplanetary data has, in the past, introduced serious errors into derived solar wind-magnetosphere coupling functions. Paper 1 also used near-continuous data to show that there is no evidence that the coupling function varies with averaging timescale τ between 1 min and 1 year. Paper 2 (Lockwood et al., 2018b) used this result to study the distribution of power input into the magnetosphere Pα (Vasyliunas et al., 1982) and why the probability density function (pdf) of <Pα>τ (i.e., Pα averaged over intervals of duration τ) has the form it does at τ = 1 min. Paper 2 also showed how this pdf evolves with increasing τ up to 3 hr, giving the observed pdfs of 3-hourly geomagnetic indices. In the present paper, we study how the distributions of power input into the magnetosphere, and of the geomagnetic indices, continue to evolve with increasing τ between 3 hr and 1 year, allowing us to study the relationships of the pdf at any relevant τ to the annual mean. These are key relationships that can make it possible to construct a climatology of space weather events based on observations of solar variability over the past 400 years.
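The relation IX = τ <X>τ quoted above can be checked numerically for equally spaced samples. The following is a minimal sketch only: the sample values, the 3-hr spacing, and the function names are illustrative assumptions and are not taken from any data set used in this paper.

```python
def integrate_index(samples, dt_hours):
    """Rectangle-rule integral of equally spaced samples (units: index x hours)."""
    return sum(samples) * dt_hours


def mean_over_window(samples):
    """Arithmetic mean <X>_tau of the samples in the window."""
    return sum(samples) / len(samples)


# Hypothetical 3-hourly index values spanning tau = 24 hr (illustrative only).
x = [4, 7, 15, 27, 48, 32, 18, 9]
dt = 3.0                      # sample spacing in hours
tau = dt * len(x)             # averaging timescale: 24 hr

I_X = integrate_index(x, dt)  # integrated activity over tau
mean_X = mean_over_window(x)  # <X>_tau

print(I_X, tau * mean_X)      # the two quantities coincide
```

Because the two quantities differ only by the factor τ, a model for the distribution of <X>τ at a given τ is also a model for the distribution of the integrated activity IX.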

1.1 Core Distributions of Space Weather Variables and Extreme Events

In this section, we make clear the distinctions between the core distribution of space weather events, large events (e.g., Lockwood et al., 2017; Lockwood, Owens, et al., 2018, studied events in the top 5%) and extreme events. Our aim is to investigate how much information on the extreme events could potentially be gleaned from the annual means and the core distribution. We use the 3-hourly ap planetary geomagnetic range index that is available continuously since 1932. This index is used because of the longevity of the data series and because it is more robust than the aa index in that it employs more than just two observatories. Appendix B shows that the ap index has a marked tendency to exaggerate the semi-annual variation in average values by having a larger response to events occurring at the equinoxes and also has a lower response to large events during Northern Hemisphere winter. We here use a version of ap, apC, which includes a correction for the effect of this uneven response in ap, as described in Appendix B. To compare to any events before 1932, we use the aa geomagnetic index, using intercalibration curves that are also presented in Appendix B.

Allen (1982) pointed out that averages of ap over a calendar day (by convention referred to as Ap = <ap>τ=1day) are not appropriate for defining storm days because an isolated storm that spans 24 hrs UT would be recorded as two moderately disturbed days rather than a single large storm day. Hence, Allen proposed using 24-hr boxcar (running) means of ap, which he termed Ap*. These have been employed by Kappenman (2005) and Cliver and Svalgaard (2004). For the purposes of identifying and ranking storm days we take the largest value of the eight such running means of the corrected ap index in each calendar day, [ApC*]MAX. A rank-order listing of the largest events defined this way is given in the supporting information, along with available references.
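Allen's point, and the definition of the daily maximum of the running means, can be illustrated with a short sketch. The synthetic series below and the choice of trailing 8-point windows (each running mean assigned to the calendar day in which its window ends) are assumptions made for illustration; the window alignment actually used for ApC* is not specified here.

```python
def calendar_day_means(ap):
    """Conventional daily means: eight 3-hourly values per calendar day."""
    return [sum(ap[d * 8:(d + 1) * 8]) / 8 for d in range(len(ap) // 8)]


def daily_max_running_mean(ap, window=8):
    """Largest trailing 24-hr running mean ending in each calendar day.

    Assumption: each running mean is assigned to the day containing the
    last 3-hr interval of its window.
    """
    daily_max = {}
    for i in range(window - 1, len(ap)):
        running_mean = sum(ap[i - window + 1:i + 1]) / window
        day = i // 8
        daily_max[day] = max(daily_max.get(day, 0.0), running_mean)
    return daily_max


# Synthetic example: a 24-hr storm (value 100) straddling midnight UT
# between day 0 and day 1, embedded in quiet conditions (value 5).
ap = [5] * 4 + [100] * 8 + [5] * 12

print(calendar_day_means(ap))      # two moderately disturbed days
print(daily_max_running_mean(ap))  # day 1 captures the full storm
```

In this synthetic case the calendar-day means split the storm into two days of 52.5 each, whereas the running mean recovers the full value of 100 for a single day.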

Many papers have found variables of near-Earth interplanetary space and the magnetosphere approximately follow a lognormal (or similar) distribution for the great majority of the time (Dmitriev et al., 2009; Farrugia et al., 2012; Hapgood et al., 1991; Lockwood & Wild, 1993; Lotz & Danskin, 2017; Love et al., 2015; Riley & Love, 2017; Vaselovsky et al., 2010; Vörös et al., 2015; Weigel & Baker, 2003; Xiang & Qu, 2018). This mathematical formulation describes the core of the distribution but often fails to match the occurrence of very large or extreme events (e.g., Baker et al., 2013; Cliver & Dietrich, 2013; Lotz & Danskin, 2017; Riley, 2012). Hence, such cases are often described by substituting a distribution to the large-event tail that is different to that which fits the core of the distribution. Extreme value statistics (EVS, e.g., Beirlant et al., 2004; Coles, 2004; Kotz & Nadarajah, 2000) has been widely applied, initially in studies of hydrology but subsequently to extreme terrestrial weather events and many other areas such as in engineering, insurance, and finance. The extremal types theorem (also called the Fisher-Tippett-Gnedenko theorem, Coles, 2004), states that extreme maxima follow one of three types of distribution (Gumbel, Fréchet, and [negative] Weibull), which are encapsulated in a family of continuous probability distributions called the generalized extreme value distribution. In the block maxima approach to extreme values, the observation period is divided into nonoverlapping periods of equal size and attention given to the maximum observation in each period to which the GEV distribution applies. In the peaks-over-threshold approach, observations that exceed a certain high threshold are selected. The second theorem in extreme value theory is the Pickands-Balkema-de Haan theorem and states that the threshold excesses have an approximate distribution within the generalized Pareto distribution family. 
EVS has been applied to geomagnetic indices (e.g., Chapman et al., 2018; Mourenas et al., 2018; Siscoe, 1976; Silbergleit, 1996, 1999; Tsubouchi & Omura, 2007), to the occurrence of very large geomagnetically induced currents (GICs, Lotz & Danskin, 2017; Thomson et al., 2011), and to the fluxes of energetic magnetospheric particles (Koons, 2001; O'Brien et al., 2007).
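The two sampling strategies described above differ only in how the extreme samples are selected before a distribution is fitted. A minimal sketch of the selection step follows; the series, block size, and threshold are hypothetical, and the subsequent fitting of a generalized extreme value or generalized Pareto distribution (and any declustering of the exceedances) is deliberately omitted.

```python
def block_maxima(series, block_size):
    """Maximum of each complete, non-overlapping block of `block_size` samples."""
    n_blocks = len(series) // block_size
    return [max(series[b * block_size:(b + 1) * block_size])
            for b in range(n_blocks)]


def threshold_excesses(series, threshold):
    """Amounts by which samples exceed `threshold` (peaks-over-threshold)."""
    return [x - threshold for x in series if x > threshold]


# Hypothetical activity index samples (illustrative only).
series = [3, 9, 4, 22, 7, 5, 48, 6, 12, 30, 8, 2]

maxima = block_maxima(series, block_size=4)           # input to a GEV fit
excesses = threshold_excesses(series, threshold=20)   # input to a GPD fit

print(maxima)
print(excesses)
```

The block maxima would be the input to a generalized extreme value fit and the threshold excesses the input to a generalized Pareto fit.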

Figure 1 places into context the relationship of the extreme event tail to the core distribution for geomagnetic activity as measured by the (corrected) ap index, apC. The plot shows (top) some selected annual distributions of the ApC* index and (bottom) the corresponding distributions of ApC* as a ratio of the annual mean value, ApC*/<apC>τ=1year. The gray histograms are for all available ApC* data (i.e., covering the years 1932–2016). Note that we here quote ap, and hence apC, ApC* and [ApC*]MAX, as indices without units (the standard ap values are an index in units of 2 nT and hence the values in nanotesla would be double those given here; Menvielle & Berthelier, 1991). The black vertical dashed line shows Apo, the 95th percentile of all available samples. The year 1960 (shown in red) was 1 year after the maximum of the largest sunspot cycle (number 19) of the recent grand solar maximum (Lockwood et al., 2009) and gave the largest annual mean value since ap measurements began (<apC>τ=1year = 23.65) and also contained the largest observed event since 1932, as determined by a daily [ApC*]MAX value of 249 on 13 November of that year. The year 2009 (in blue), which was at the low sunspot minimum between cycles 23 and 24, gave the smallest annual mean in the record (<apC>τ=1year = 3.93). The year 1859 (in orange) has been chosen because between 28 August and 5 September of that year, the Carrington event took place (see contemporary reports by E. Loomis, collected together by Shea & Smart, 2006), which is thought to be the largest terrestrial space weather event to have been observed as it happened (Cliver & Dietrich, 2013; Nevanlinna, 2006; Ngwira et al., 2014). The mean <ap>τ=1year for 1859 has been estimated to have been 10.98 by Lockwood, Owens, et al. (2018).
The distribution of daily Ap occurrence for 1859 shown in Figure 1 has been generated from the estimated mean value for that year using a model that will be developed in the present paper and is described in Appendix A. The distribution for 2012 is included (in green, <apC>τ=1year = 9.20) because on 23 July of that year a very large and very rapid coronal mass ejection erupted, an event which would have generated extreme terrestrial space weather (a superstorm) had it hit the Earth. It was observed as it passed over the STEREO-A spacecraft and, from modeling based on the measurements taken by that craft and by solar instruments, it is estimated it would have caused a terrestrial event as large as the Carrington event, had the eruption taken place just 1 week earlier such that the coronal mass ejection would have hit Earth's magnetosphere instead of STEREO-A (Baker et al., 2013; Ngwira et al., 2013). From available magnetometer data, Nevanlinna (2006) has estimated that the daily aa geomagnetic index reached Aa = 400 nT during the Carrington event. This estimate allows for missing data but may still be an underestimate, and Cliver and Svalgaard (2004) estimated the peak value of the running mean of a corrected version of the aa index over 24 hr of Aa* to be 425 nT. The aa index was designed by Mayaud (1972, 1980) to act as an equivalent to the ap index using data from just two stations: however, the data since 1932 show that the two are not linearly related, with ap at large aa being significantly lower than would be obtained from a linear fit. Polynomial fits of daily means, Ap, as a function of the daily means in aa (by convention termed Aa) are given in Appendix B for the four quarter-year intervals around the equinoxes and solstices. Taking the peak Aa to be 425 nT for the Carrington event, the relevant equation B3 gives an estimated maximum Ap* value of 284 ± 30. 
Because it is considered that the STEREO event would have given a storm comparable to the Carrington event, we here take this Ap* to apply to it as well. These values of [Ap*]MAX of 284 are shown by the vertical dash-dotted lines. Applying the time-of-year correction given in Appendix B, this yields [ApC*]MAX of 215 ± 23 and 211 ± 23, respectively, for these two events. These estimated [ApC*]MAX values for the Carrington and STEREO events are shown in Figure 1a by, respectively, the solid vertical orange and green lines. By way of comparison, the largest daily value in the observed [ApC*]MAX record (since 1932) is 249, recorded on 13 November 1960.

Distributions of apC, the ap index corrected for the annual variation in its response function (see Appendix B). Annual distributions of (top) eight-point running (boxcar) means of the 3-hourly apC values, ApC*, and (bottom) of those means as a ratio of the annual mean value for the calendar year in question, ApC*/<apC>τ=1year, for (red) 1960; (blue) 2009; (green) 2012; and (orange) modeled for 1859. The gray histograms in the background are the distributions for all 248368 ApC* values available from the interval 1932–2016. The vertical orange lines mark the estimated value for the peak of the 1859 Carrington event: the solid orange line is estimate 1, [ApC*]MAX, which makes allowance for the time-of-year response of the ap index (also marked by an orange triangle), and the dash-dotted orange line is [Ap*]MAX, which does not make this correction (estimate 2, also marked by an orange square). The uncertainty bars arise only from the conversion of Aa* to Ap* and do not include the uncertainty in the Aa* estimate. The distributions for 2012 are shown because in that year an event, which it is estimated would have caused an extreme event almost as large as the Carrington event, passed over the STEREO-A craft but missed the Earth. The vertical green lines show the estimated maximum for that event, had it hit Earth: the solid green line and green triangle are for the [ApC*]MAX (estimate 1) value, and the dash-dotted green line and green square are for the [Ap*]MAX (estimate 2) value. The vertical colored dashed lines give the 95th percentile of the annual distributions, using the same color scheme, and the vertical black dashed lines are the equivalent for Apo, the 95th percentile of all ApC* values. The short vertical cyan lines show the top 100 (0.32%) of the maximum ApC* values in a calendar day, [ApC*]MAX, and the short vertical mauve lines mark the [ApC*]MAX values for the 6 days in the top 0.02%. The top 100 events, with further details, are listed in Part 3 of the supporting information.

The list of storm days since 1932, ranked by their [ApC*]MAX values, is given in the supporting information. It has similarities to other such lists (e.g., Cliver & Svalgaard, 2004; Kappenman, 2005; Nevanlinna, 2008), but there are differences because we have made allowance for the variation with time of year of the Ap* response and, in the case of the Carrington event, the relationship between the Aa* and Ap* indices. Even quite small changes in the estimated magnitude of the storm day can have a large effect on its ranking order. The major surprise is that the positions of both the Carrington and STEREO events in the list are somewhat lower than in other lists if we correct for the time-of-year dependence of ap (estimate 1, [ApC*]MAX). This raises the question as to whether this correction should be applied to these events or not. Logically, there is no doubt that it should be, as equation B3 converts the Aa* estimate into an Ap* value that should then need correcting to become ApC*. The main argument for not applying the correction is that the original Aa* estimate is a proxy compiled from other sources. That these sources are largely European sector midlatitude observatories and Ap is heavily weighted to midlatitude European station data does argue that this correction should indeed be applied. However, there remains great uncertainty in the true magnitude of the Carrington event.
We also note that [ApC*]MAX is almost certainly not a fully adequate metric of this superstorm because it does not take account of the fact that the Carrington event on 3 September was in the middle of an extended interval of very high geomagnetic activity between 28 August and 5 September, and this almost certainly drove excessively large negative Dst values through the integrated effect on the ring current population, giving the large (–700 nT) persistent negative deflection recorded at the lower-latitude Colaba observatory in Mumbai following the short-lived and huge (–1500 nT) initial impulse.

Lockwood et al. (2017) estimated the annual mean power input into the magnetosphere <Pα>τ from the reconstructed solar wind and interplanetary field variables derived by Owens et al. (2017), and from this Lockwood, Owens, et al. (2018) have estimated that the annual mean of ap for 1859 was 10.98. Hence, the estimated peak [ApC*]MAX/<apC>τ=1year for the Carrington event is 19.5 ± 2.1 (shown by the solid orange line in the lower panel of Figure 1) for the corrected data and [Ap*]MAX/<apC>τ=1year = 25.9 ± 2.7 for the uncorrected value (the orange dash-dotted line). From the observed <apC>τ=1year of 9.20 for 2012 the [ApC*]MAX/<apC>τ=1year for the STEREO event would have been 23.0 ± 2.5 (shown by the green line in the lower panel of Figure 1) and [Ap*]MAX/<apC>τ=1year = 30.9 ± 3.3 for the uncorrected data (green dash-dotted line). These ratio estimates are much larger values than for the observed 13 November 1960 event, for which [ApC*]MAX/<apC>τ=1 year is considerably lower, being 10.51 because it occurred during the most active geomagnetic year on record. Table S7 of the supporting information shows that the largest value of [ApC*]MAX/<apC>τ=1year in the observational record (since 1932) is 16.27 for 8 February 1986 (for which [ApC*]MAX = 203, the seventh largest value). This is the outstanding example in the observational record of a big storm being observed very close to sunspot minimum; however, its [ApC*]MAX/<apC>τ=1year ratio is still very much smaller than that estimated for the Carrington and STEREO events. In their absolute corrected ApC* values or uncorrected Ap* values, the Carrington and STEREO events appear to be comparable with, or somewhat larger than, the largest events seen since 1932; however, they arose in years of relatively low average activity and so are wholly exceptional in their ApC*/<apC>τ=1year and Ap*/<apC>τ=1year values.
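The ratios quoted in this paragraph follow directly from the peak estimates and the annual means given in the text. The sketch below reproduces the arithmetic under the simplifying assumption (ours, for illustration) that the annual mean is exact, so that the uncertainty in the ratio is just the peak uncertainty divided by the annual mean.

```python
def ratio_with_uncertainty(peak, peak_err, annual_mean):
    """Return (ratio, uncertainty), treating the annual mean as exact (an assumption)."""
    return peak / annual_mean, peak_err / annual_mean


# (peak value, peak uncertainty, annual mean), all as quoted in the text.
events = {
    "Carrington 1859, corrected [ApC*]MAX": (215, 23, 10.98),
    "Carrington 1859, uncorrected [Ap*]MAX": (284, 30, 10.98),
    "STEREO 2012, corrected [ApC*]MAX": (211, 23, 9.20),
    "STEREO 2012, uncorrected [Ap*]MAX": (284, 30, 9.20),
    "13 November 1960, observed [ApC*]MAX": (249, 0, 23.65),
}

ratios = {name: ratio_with_uncertainty(*vals) for name, vals in events.items()}

for name, (ratio, err) in ratios.items():
    print(f"{name}: {ratio:.2f} +/- {err:.2f}")
```

The computed values agree with those quoted above to within about 0.1, the small differences being attributable to rounding of the input values given in the text.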

Figure 1 demonstrates why the description of superstorms requires more than an extrapolation of the core and hence needs the application of EVS. However, there may still be some valuable information on extreme events to be obtained from the core distribution, as Love (2012) and Love et al. (2015) have demonstrated for large geomagnetic storms (as defined and quantified using the Dst geomagnetic index). The points in Figure 2 show the available 31047 24-hr [ApC*]MAX samples as a function of the annual mean of the year in which they occur: the cyan points are the top 100 days (0.32%) in terms of [ApC*]MAX value (shown by the short vertical cyan lines in Figure 1); the mauve points are the top 6 days (0.02%, shown by the short vertical mauve lines in Figure 1); and the gray points are the remaining 99.68%. Figure 2 stresses how much our understanding rests rather critically on the estimates of the 1859 and 2012 superstorm values (the orange and green squares being the uncorrected values and the triangles being the corresponding corrected values). If we do not consider these two events and look just at the observed record since 1932, we see a quite strong relationship between the largest value seen in the year and the average value for that year, with the data points falling in the bottom right half of the plot. The corrected [ApC*]MAX values for the 1859 and 2012 superstorms (the orange and green triangles) are close to being in line with this relationship, especially the lower values of the uncertainty range. These values suggest that the occurrence of extreme superstorms is (weakly) related to the average activity in those years and that the extreme events are forming something like the negative Weibull distribution pileup toward a maximum possible value not much greater than that for the November 1960 event.
However, the uncorrected values, [Ap*]MAX (shown by the green and orange squares), appear to be a completely different class of event from the events seen after 1932, one that does not obey any sort of relationship between the peak and mean values. We should here also note that it is possible that even these uncorrected values are underestimates (being based on the Cliver & Svalgaard [2004] estimate of Aa*) that have been limited by the procedure of quantizing the available data into k-index bands (see Lockwood, Chambodut, et al., 2018). Thus, the uncertainty in the estimated severity of the Carrington and STEREO events becomes crucial. On the other hand, the lower estimates for the Carrington and STEREO events suggest that the annual mean value and the core distribution could be helpful in quantifying the probability of the extreme events.
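The sample count and the percentages used to define the event classes in Figure 2 can be verified with a few lines of arithmetic. In the sketch below, the nearest-rank percentile definition is an assumption chosen for simplicity (the percentile convention used for Apo is not stated here), and the demonstration list is purely synthetic.

```python
import math
from datetime import date


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile (one of several common conventions; an assumption)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def top_fraction_percent(n_top, n_total):
    """Percentage of all samples represented by the n_top largest."""
    return 100.0 * n_top / n_total


# Number of calendar days in 1932-2016 inclusive.
n_days = (date(2017, 1, 1) - date(1932, 1, 1)).days

print(n_days)                                       # daily [ApC*]MAX samples
print(round(top_fraction_percent(100, n_days), 2))  # top 100 days, in percent
print(round(top_fraction_percent(6, n_days), 2))    # top 6 days, in percent

# Synthetic demonstration of a 95th-percentile threshold.
demo = list(range(1, 101))
print(nearest_rank_percentile(demo, 95))
```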

The largest ApC* values in a calendar day, [ApC*]MAX, as a function of the annual mean for the calendar year of that day <apC>τ=1year for 1932–2016 (inclusive). The gray points make up 99.68% of the available 31047 daily [ApC*]MAX samples, the cyan points being in the top 100 days in terms of their [ApC*]MAX value (also shown by the short vertical cyan lines in Figure 1), and the mauve points the 6 days in the top 0.02% (shown by the short vertical mauve lines in Figure 1). The top 100 days are listed in the supporting information. The orange and green triangles show the estimated [ApC*]MAX values for the Carrington and STEREO-A events (in 1859 and 2012, respectively, see text for details), and the orange and green squares show the corresponding uncorrected [Ap*]MAX values. The uncertainty bars arise only from the conversion of Aa* to Ap* and do not include any uncertainty in the Aa* estimate. The horizontal dashed line is Apo, the 95th percentile of all ApC* values. The colored tickmarks along the x axis mark the annual means of the four annual distributions shown in Figure 1 (from left to right 2009, 2012, 1859, and 1960), using the same color scheme.

Even if the former proves to be the case and annual means are of no assistance in predicting superstorms, characterizing the core of the distribution (as opposed to the extreme tail) is, however, still important in space weather applications where the integral of the space weather activity is of relevance and the threshold to the effect is not in the extreme tail. Examples would include the effect of GICs on pipeline corrosion (Boteler, 2000; Cole, 2003; Gummow, 2002; Ingham & Rodger, 2018; Pirjola, 2005; Pirjola et al., 2005; Pulkkinen et al., 2001; Viljanen et al., 2006); the effect of GICs on power grid transformer degradation (Gaunt, 2016; Kappenman & Radasky, 2005); the effect of energy deposition in the upper atmosphere on the orbits of LEO satellites and space debris (Doornbos & Klinkrad, 2006); and the effect of integrated radiation dose on the degradation of spacecraft electronics (Baker, 2000; Fleetwood et al., 2000). In all these examples, although the extreme superstorm events have a large effect, they are rare and a much larger number of smaller events, described by the core distribution, can also have a significant integrated effect. Lastly, we note that Chapman et al. (2018) have recently studied the extreme event tails in several terrestrial disturbance indices during recent maxima of the solar cycle and fitted generalized Pareto distributions. They found that if the mean and variance of the large-to-extreme observations can be predicted for a given solar maximum, then a relationship between the core distribution and the extreme tail can be found giving a description of the full distribution. Thus, it does appear possible that the study of the core of the distributions presented here could be extended to characterize the extreme tails: this will be the subject of a future study.

As mentioned above, the [ApC*]MAX values are unlikely to be the best indicators of all storm characteristics, in particular in relation to the ring current and the Dst geomagnetic index. This gives another reason why we should study the core of the distributions, associated with storm preconditioning and the fact that the best predictors of large Dst storm occurrence are time integrated over long intervals (several days, Borovsky, 2017; Lockwood et al., 2016). The largest and most disruptive geomagnetic storms tend to be the longest lived (Balan et al., 2016; Echer et al., 2008; Mourenas et al., 2018). Many large and long-lived storms show a two-step development (Tsurutani et al., 1999; Xie et al., 2006); however, these multistep storms have been shown not to originate from just a simple superposition of individual events (Chen et al., 2000; Kozyra et al., 1998, 2002) and it is not yet fully clear how the implied preconditioning originates. Kozyra et al. (1998) argued that prior energetic particle injections are swept out of the dayside magnetopause as the second population from the plasma sheet moves into the inner magnetosphere and so suggested that the preconditioning occurs in a multistep storm through the cumulative effects of the successive storms on the population in the source plasma sheet (Chen et al., 2000; Kozyra et al., 1998, 2002). Alternatively, it has been suggested that prior storms prime the inner magnetosphere through O+ ions injected from the ionosphere (Daglis, 1997; Hamilton et al., 1988). Lockwood et al. (2016) have shown that the key element in driving the largest storms (as measured by the Dst index) is not so much the peak magnitude of the interplanetary coupling function but rather the timescale over which it applies: large storms are a response to forcing that is both large and sustained over several days. (In other words, very large interplanetary coupling function values do not drive major storms if they persist for only short intervals.)
Borovsky (2017) reached the same conclusion in relation to the damaging relativistic electron fluxes generated in the largest storms. Thus, there is likely to be some information in the core of the distributions that could be exploited to predict the occurrence of the long-lived and extreme events. Lastly, we also note that Kauristie et al. (2017) have also looked at the core distributions of ap, Dst (as well as am and dDst/dt), not with a view to identifying highly disturbed periods and large and extreme events, rather the opposite—to find the quietest intervals that could be used to generate an empirical model of the undisturbed main field.

1.2 Construction of a Space Weather Climatology

A number of techniques that have been developed and refined for terrestrial meteorological and climate studies are now being deployed in the field of space weather. In addition to EVS discussed above, these include Numerical Weather Prediction (Pizzo et al., 2015); data assimilation (Barnard et al., 2017; Lang et al., 2017; Lang & Owens, 2018; Siscoe & Solomon, 2006; Schunk et al., 2014); cost-loss analysis (Henley & Pope, 2017; Owens & Riley, 2017); ensemble forecasts (Knipp, 2016); climate analog forecasts (Barnard et al., 2011); ensemble climate reconstructions (Owens et al., 2016a, 2016b); skill scores (Balch, 2008); and several others. In meteorology, many of these techniques are used in conjunction with a climatology that describes statistically the probability of a relevant variable at key locations having one of the full potential range of values. Climatological forecasts assume that the future of a system can be determined from these statistical properties of the past behavior of that system. These will clearly often be rendered invalid by long-term changes in the system that are not covered by the climatology. This limitation to climatological forecasts can actually be useful because deviations from climatological forecasts (anomalies) can be used to detect and quantify the effects of the long-term changes. Note that long-term changes can also generate false conclusions about, for example, skill scores or event occurrence, if they are neglected (e.g., Hamill & Juras, 2006).

There are four elements that we need to generate a useful climatology of space weather for each of the key variables: (1) the mean value (over a convenient period such as a year), (2) the core distribution of values about that mean, (3) the extreme tail of the distribution (giving the repeat period of superstorms), and (4) the autocorrelation function, ACF. All these would be available to us if we possessed the time series at high enough temporal resolution and over an interval long enough that adding any more data does not significantly alter the distribution. This approach has been employed by Matthes et al. (2017) to build a space climatology using the aa geomagnetic index that extends back to 1868. Unfortunately, as discussed below, this does not include the grand minima conditions such as existed during the Maunder minimum (Usoskin et al., 2015) that we know from cosmogenic isotopes to have prevailed for extended periods roughly 30 times in the last 9000 years (e.g., Barnard et al., 2011). These four elements would enable us to evaluate integrated deterioration of systems influenced by space weather, the probability of an event over a certain size, and the probability of multiple events that may have a greater effect than the sum of the effect of the events individually. There is great emphasis in space weather on protecting systems from the largest events or, at least, evaluating the risk posed by those events. However, evaluating the distribution core and mean and the probabilities of quiet conditions is also important to avoid the cost and other wasted resources associated with overengineering systems (such that they become obsolete long before they are lost or degraded) and so ensuring that the designs are cost effective. As pointed out by Henley and Pope (2017), the development of a useful space weather climatology, as with forecasting procedures, requires a detailed dialog with the system design engineers and end users.
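Of the four elements listed above, the ACF is perhaps the least familiar to compute. A minimal sketch of a sample ACF estimate is given below; it uses the standard biased estimator (lagged covariance normalized by the zero-lag sum of squares), which is an assumption on our part and not necessarily the estimator used elsewhere in this series of papers. The short series is synthetic.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation at integer `lag` (biased estimator, an assumption)."""
    n = len(series)
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    lagged_sum = sum((series[i] - mean) * (series[i + lag] - mean)
                     for i in range(n - lag))
    return lagged_sum / variance_sum


# Synthetic, persistent series: a single smooth rise and decay (illustrative only).
x = [5, 6, 8, 12, 20, 18, 12, 8, 6, 5]

acf = [autocorrelation(x, lag) for lag in range(4)]
print([round(a, 3) for a in acf])
```

A slowly decaying ACF indicates persistence, which controls how quickly the distribution of <X>τ narrows as the averaging timescale τ is increased.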

The biggest problem in trying to assemble a space weather climatology is the long timescales of the variations (Henley & Pope, 2017). The primary periodicity in space weather is the solar cycle oscillation the period of which averages about 11 years. Since in situ observations of the near-Earth space environment began, we have accrued direct space weather data for just four such cycles. To put this in context, consider a terrestrial tropospheric weather climatology: the dominant periodicity is 1 year and a climatology based on just 4 years would not be of much value for most applications. Hence, as pointed out by Lockwood (2003), we need to extend the interval by using other measurements and inferring the space weather variables, rather than just using the direct measurements.

The most direct way of doing this is to employ geomagnetic activity observations, as used by Matthes et al. (2017). In theory these could extend back to 1832, when Gauss established the first well-calibrated geomagnetic observatory in Göttingen. Reviews of the development of the observation of geomagnetic activity have been given by Stern (2002) and Lockwood (2013). Some composites have used geomagnetic activity data from soon after the establishment of Gauss' observatory; for example, Svalgaard and Cliver (2010) used regressions with different types of geomagnetic data to extend the sequence back to 1835. However, there are concerns about the calibration, stability, and homogeneity of the earliest data (Lockwood, 2013).

Geomagnetic activity on annual timescales depends on both the solar wind speed VSW and the interplanetary magnetic field (IMF) strength, B, and the first separation of the two was made by Lockwood et al. (1999) using two different geomagnetic indices (the aa index and Sargent's recurrence index derived from aa). Later, Lockwood et al. (2014) used four different pairings of different indices to derive VSW, B and the open solar flux, with a full Monte Carlo uncertainty analysis, back to 1845. From this date, the geomagnetic data give us almost 17 full solar cycles, considerably more useful than the 4 available in direct observations but still not enough for a full climatology that allows for centennial scale solar change. Crucially, this interval does not include the Maunder minimum (or even the lesser Dalton minimum), and hence a climatology based on geomagnetic data would not cover grand minimum conditions.

Recent advances allow us to start to construct a climatology based on sunspot numbers which are available with reasonable regularity from about 1612, soon after the invention and patenting of the telescope in 1608. Owens et al. (2017) have used the sunspot number data in conjunction with modeling to reconstruct the solar wind number flux NSW, as well as B and VSW from 1615 onward. This has enabled Lockwood et al. (2017) to reconstruct the annual mean power input into the magnetosphere from 1615 and from this Lockwood, Owens, et al. (2018) have estimated the annual means of the ap index. These advances make it possible to construct elements of a climatology which extends over 30 clear solar cycles as well as the 50-year break to normal solar cycles during the Maunder minimum. During the Maunder minimum, the modeling predicts 8 small-amplitude, smaller-period cycles which show a different phase relationship with the weak cycles in sunspot numbers. Owens et al. (2012) have shown evidence for these small Maunder-minimum cycles in galactic cosmic ray fluxes.

In addition to the increased number of solar cycles, these reconstructions that extend back to the early 17th century cover both a grand minimum (the Maunder minimum [Usoskin et al., 2015]) and the recent grand solar maximum (Lockwood et al., 2009). There is also potential to even extend the climatology to cover up to 9000 years, covering 24 grand maxima and 30 grand minima, using cosmogenic isotope abundance measurements which generally require decadal averages or which are smoothed by the time constants of the isotope deposition into the terrestrial reservoirs where they are measured. Barnard et al. (2011) have discussed a method for temporal scale changing from these decadal-scale data to annual means. At the present time we are lacking one key element, namely a way to determine the times of solar cycle minimum and/or maxima and hence the phase of the solar cycle of each year.

In paper 1 of this series of 3 papers (Lockwood et al., 2018a), we showed that the total power input into the magnetosphere Pα can be computed using a constant coupling exponent α that does not depend on the averaging timescale τ (previous studies that had suggested it did were adversely influenced by data gaps). Paper 2 (Lockwood et al., 2018b) studied how the core distributions of Pα on timescales of 3 hr and less arise. In the current paper we study how and why these distributions in Pα evolve with averaging timescale τ and the subsequent evolution with τ of the ap (section 2.3) and Dst (section 2.4) geomagnetic indices. In each of these two sections we develop an algorithm that allows the core distribution for that geomagnetic index to be evaluated for a given mean value and at a required timescale, τ. The formulae required to implement these algorithms are given in Appendix A.

2 Distributions of Power Input to the Magnetosphere and Geomagnetic Indices

Figure 3 studies the evolution with averaging timescale τ of the distribution of three space weather indicators. The left-hand panels show the power input into the magnetosphere, computed from the near-continuous interplanetary data for 1996–2016 (inclusive) and normalized to the mean value over the calendar year, <Pα>τ/<Pα>1year. The central panels show the normalized geomagnetic ap index, <ap>τ/<ap>1year, from the full data set available (for 1932–2016), and the right-hand panels show the normalized negative geomagnetic Dst index, <Dst′>τ/<Dst>1year (where Dst′ is defined below), again using all the available data (for 1957–2016).

Figure 3.
Distributions of (left-hand panels) normalized power input into the magnetosphere, <Pα>τ/<Pα>1year; (central panels) normalized geomagnetic ap index, <ap>τ/<ap>1year; and (right-hand panels) normalized negative geomagnetic Dst index, <Dst′>τ/<Dst>1year. The coupling function of α = 0.44, shown in Paper 1 to apply at all τ, is used to generate Pα. The distributions are of the means taken over intervals τ long, divided by the annual mean of all samples in that year. The blue histograms are the observed distributions, with samples binned into 150 contiguous bins centered on k.x98/100 where k is varied between 0.5 and 149.5 in steps of 1 and x98 is the 98th percentile of the cdf, and the numbers of samples n are then normalized such that (x98/100)Σn is unity. The black lines show the best fit lognormal distributions, and the mauve lines are the best fit Weibull distributions (with mean value m = 1 in the cases of Pα and ap and m = Rm(τ) for Dst′). Fits are made using Maximum Likelihood Estimation (see supporting information). The total number of available samples, N, is given in each panel. (a–c) For τ = 1 year; (d–f) for τ = 0.5 year; (g–i) for τ = 27 days; (j–l) for τ = 7 days; (m–o) for τ = 1 day; and (p–r) for τ = 3 hr. The Pα data are for 1996 to 2016 (inclusive), the ap data for 1932 to 2016 (inclusive), and the Dst′ data for 1957 to 2016 (inclusive). <Dst>τ ≥ 0 samples are omitted, giving Dst′, in histograms and distribution fits (so, because all <Dst>1year values are negative, these give <Dst′>τ/<Dst>1year ≥ 0). As a result, N for Dst′ is 100%, 99.17%, 94.08%, 88.42%, 80.60%, and 78.48% of all Dst samples for τ of, respectively, 1 year (c), 0.5 year (f), 27 days (i), 7 days (l), 1 day (o), and 3 hr (r). The best fit distribution parameters, goodness-of-fit metrics, and cdf and pdf plots are given in the supporting information for these two fitted distributions and five others. cdf = cumulative distribution function; pdf = probability density function.

The coupling function of α = 0.44, shown in Paper 1 (Lockwood et al., 2018a) to apply at all τ, is used with the equation of Vasyliunas et al. (1982) to generate Pα (described in Lockwood et al., 2017, Lockwood, Owens, et al., 2018; Lockwood et al., 2018b). The ap index responds primarily to the substorm current wedge (see Lockwood, 2013) and the Dst index primarily to the ring current. However, Dst is importantly also influenced by other currents (e.g., Turner et al., 2000) such as the Chapman-Ferraro currents in the magnetopause and so also varies with compressions of the dayside magnetosphere by solar wind dynamic pressure enhancements. The ring current effect dominates, meaning that Dst is increasingly negative as activity increases, but the dynamic pressure effect means that positive Dst values can occur. Corrections for the effect of solar wind dynamic pressure on Dst, via magnetopause currents, have been developed (Consolini et al., 2008; O'Brien & McPherron, 2000) but we do not use them here, mainly because doing so reduces the available data set to after 1996 (when quasi-continuous interplanetary data are available) but also because a great many papers have used the uncorrected Dst index to characterize magnetic storms in the past. The fact that Dst, unlike ap (or Pα), can have either sign generates a fundamental difference between the ap and Dst indices when trying to formulate a long-term climatology: when activity is low, ap tends to a limiting value of zero whereas Dst tends toward a distribution of values spread around zero. Half-wave rectifying Dst so that positive values are put to zero is not an option as this generates a large number of samples at zero that distorts the distribution. Instead we here treat Dst ≥ 0 as data gaps (we here call the index so derived Dst′), which yields an index that correlates much better with multiplicative interplanetary coupling functions (Lockwood, 2013). 
However, such samples are still included in the total number when computing the occurrence probability of a large negative Dst value. Note that using Dst′ instead of Dst is purely a measure that gives us a unipolar activity index to work with (which makes the modeling required much less complex) and is not, in any way, a correction for magnetopause currents. Of course, even strongly negative Dst values will still be influenced by magnetopause currents to some extent, which is why Dst is an imperfect metric of ring current storms. In a later paper we will present a separate model for predicting the distributions of the pressure-corrected index, Dst*, as a function of τ. Note that Dst* also has both positive and negative values (see Figures 1 and 2 of Consolini et al., 2008) and so the same sort of techniques will be required for the construction of a model for Dst* as are used here for Dst.

To summarize the procedure employed here: we make normalized values of the variable X for 1996–2017 (inclusive), where X is one of the observed variables Pα, ap, and Dst′ for a given averaging timescale τ (also done for the synthesized variables XR and XRF that are used below to clarify the behavior of the observed variables). We normalize by dividing by the arithmetic mean for the calendar year of the sample, <X>τ=1year. From these normalized values we derive the distribution of X/<X>τ=1year for all 22 years studied. This distribution has an arithmetic mean m = 1, which is the grand mean (or the mean-of-means) of the 22 annual normalized data subsets and which applies because we have, to a good approximation, the same number of samples in each year. We then fit model pdfs so that we can empirically model the probability of X/<X>τ=1year, which is the probability of X for a given <X>τ=1year, that is, P(X|<X>τ=1year). Hence this enables us to achieve our goal of empirically modeling the distribution of X for a given <X>τ=1year. We wish this fitted distribution to reproduce the observed one as closely as possible, so we use model distributions of mean μ = m = 1 and find the optimum variance v using Maximum Likelihood Estimation. Some of the distributions fitted are described by shape and scale parameters instead of μ and v, and these are constrained so that μ is unity. The procedure is repeated for the full range of averaging timescales, τ.
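The normalization step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the processing code used for the paper: the function name `normalized_block_means`, the 360-day "year", and the Weibull stand-in data are all hypothetical, and real data would also need the handling of data gaps discussed in Paper 1.

```python
import random
from statistics import fmean

def normalized_block_means(values_by_year, samples_per_block):
    """Return <X>_tau / <X>_1year for every complete tau-block in every year."""
    normalized = []
    for year in sorted(values_by_year):
        series = values_by_year[year]
        annual_mean = fmean(series)          # <X>_{tau = 1 year}
        n_blocks = len(series) // samples_per_block
        for b in range(n_blocks):
            block = series[b * samples_per_block:(b + 1) * samples_per_block]
            normalized.append(fmean(block) / annual_mean)
    return normalized

# Hypothetical stand-in data: 3-hourly samples, 8 per day, 360 days per "year",
# with an annual mean that differs from year to year.
rng = random.Random(0)
values_by_year = {
    year: [rng.weibullvariate(1.0 + 0.1 * (year % 5), 1.0625)
           for _ in range(8 * 360)]
    for year in range(1996, 2018)
}

daily = normalized_block_means(values_by_year, samples_per_block=8)   # tau = 1 day
grand_mean = fmean(daily)
print(len(daily), round(grand_mean, 6))
```

Because every year here contributes the same number of complete blocks, the grand mean of the normalized values is unity to within rounding, which is the property m = 1 that the fitted distributions are constrained to share.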

The blue histograms in Figure 3 are the observed distributions, the black lines show the best fit lognormal distributions, and the mauve lines are the best fit Weibull distributions (both with mean value μ = 1 in the cases of Pα and ap and μ = Rm(τ) for Dst′, where Rm deviates from unity because in Dst′ we treat each <Dst>τ ≥ 0 sample as a data gap: the factor Rm(τ) is discussed further later). The blue histograms were generated by counting the number of samples in 150 contiguous bins centered on k.x98/100, where k is varied between 0.5 and 149.5 in steps of 1 and x98 is the 98th percentile of the distribution. The numbers of samples n in each bin are then normalized so that Σn(x98/100) is unity. Fitting a distribution directly to these histograms gives results which, in general, depend on the bin width adopted (e.g., Woody et al., 2016), and so we here fit distributions using Maximum Likelihood Estimation (MLE), which does not require prior binning of the data into bins of arbitrarily chosen width. A basic description of MLE fitting, and of goodness-of-fit metrics (both absolute and relative), is given in the supporting information. Plots of best fit pdfs and cumulative distribution functions, and tables of best fit distribution parameters and goodness-of-fit metrics, are also given in the supporting information for seven standard distribution forms: the normal (Gaussian) distribution, the lognormal distribution, the Weibull distribution, the Burr distribution, the Gamma distribution, the Log-logistic (Fisk) distribution, and the Rician distribution. For all these distributions the number of degrees of freedom is df = 2, except the Burr for which df = 3.
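To illustrate what a mean-constrained MLE fit involves, the sketch below treats only the lognormal case. It assumes that fixing the mean at unity ties the log-space location to the log-space variance σ² through μ_log = −σ²/2; setting the derivative of the log-likelihood to zero then gives the closed form σ² = 2(√(1 + ⟨(ln x)²⟩) − 1). This derivation and the function name `fit_unit_mean_lognormal` are ours, offered as an illustration rather than as the procedure documented in the supporting information.

```python
import math
import random

def fit_unit_mean_lognormal(samples):
    """MLE of a lognormal constrained to mean 1.

    Returns (sigma2, v): the log-space variance and the variance of the
    distribution itself, v = exp(sigma2) - 1 for unit mean.
    """
    logs = [math.log(x) for x in samples]
    mean_sq = sum(y * y for y in logs) / len(logs)      # <(ln x)^2>
    sigma2 = 2.0 * (math.sqrt(1.0 + mean_sq) - 1.0)     # stationary point of the likelihood
    return sigma2, math.expm1(sigma2)

# Check on synthetic data drawn from a unit-mean lognormal (hypothetical sigma2).
rng = random.Random(1)
true_sigma2 = 0.25
samples = [rng.lognormvariate(-0.5 * true_sigma2, math.sqrt(true_sigma2))
           for _ in range(50_000)]
sigma2_hat, v_hat = fit_unit_mean_lognormal(samples)
print(round(sigma2_hat, 3), round(v_hat, 3))
```

No binning enters at any point, which is the advantage of MLE noted above. Families without a closed-form solution (for example the Weibull with its scale tied to its shape) would need a one-dimensional numerical maximization instead.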

The top row in Figure 3 is for averaging timescale τ = 1 year and the rows beneath are, successively, for τ of 0.5 year, 27 days, 7 days, 1 day, and 3 hr (0.125 day). The omission of positive <Dst>τ samples has no effect for τ = 1 year (as all values are negative), but the number of Dst′ samples is 99.17%, 94.08%, 88.42%, 80.60%, and 78.48% of all Dst samples for τ of, respectively, 0.5 year, 27 days, 7 days, 1 day, and 3 hr. Because of the normalization, the distributions for τ = 1 year are, by definition, delta functions at unity. At general τ, the distributions for <ap>τ/<ap>1year are always close to lognormal in form (the black lines), with the variance increasing with decreasing τ (see supporting information for goodness-of-fit evaluations). At the larger τ, the low variance lognormal distributions are essentially Gaussian in form. On the other hand, the Dst′ distributions are equally well fitted by the Weibull, Gamma, or Log-logistic families of distributions (see supporting information), and in Figure 3 we show the Weibull distributions (the mauve lines), again with variance increasing with decreasing τ. Note that for Dst′, significantly better fits could be obtained using a distribution with an extra degree of freedom, such as the Burr (see supporting information). The difference between ap and Dst′ is caused by the flatter and broader distribution at small <Dst′>τ/<Dst>1year values. The <Pα>τ/<Pα>1year distributions are lognormal in form for τ greater than about 2 days, but at lower τ these distributions are increasingly Weibull like in form. The origin of a Weibull form at low τ was discussed in Paper 2 (Lockwood, Bentley, et al., 2018b) and is associated with the variability of the IMF orientation factor on these timescales, via the quasi half-wave rectification effect of the southward component of the IMF on solar wind-magnetosphere coupling. 
Note that because of the smoothing effect of the magnetospheric energy storage/release system, the Weibull distribution of power input to the magnetosphere for small τ yields a lognormal distribution in power input on the timescales relevant to ap and hence in ap itself.

The evolution of the distributions shown by the different rows of Figure 3 reveals the Central Limit Theorem (hereafter CLT) in action (Heyde, 2006; Fischer, 2011; Wilkes, 1995). This states that when independent random variables are added, their properly normalized sum tends toward a normal distribution. It applies in this context because the key operation in taking an average value is summation and because, as τ is increased in relation to the correlation timescale, an increasing fraction of the samples is independent.
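The effect of the CLT on independent samples can be seen in a small numerical experiment. The sketch below is an illustration only: it draws independent values from a Weibull distribution (the shape and scale are borrowed from the synthetic variable used later in the text) and averages them in blocks of n samples. For independent draws the variance of the block means should fall roughly as 1/n and the skewness roughly as 1/√n, so the distribution narrows and becomes more symmetric. The helper names are hypothetical.

```python
import random
from statistics import fmean, pvariance

def skewness(xs):
    """Sample skewness (third standardized moment)."""
    m = fmean(xs)
    s2 = pvariance(xs, m)
    return fmean([(x - m) ** 3 for x in xs]) / s2 ** 1.5

def block_means(xs, n):
    """Means of consecutive, non-overlapping blocks of n samples."""
    return [fmean(xs[i:i + n]) for i in range(0, len(xs) - n + 1, n)]

rng = random.Random(2)
# random.weibullvariate takes (scale, shape), in that order.
raw = [rng.weibullvariate(1.0240, 1.0625) for _ in range(200_000)]

results = {}
for n in (1, 8, 64):
    means = block_means(raw, n)
    results[n] = (pvariance(means), skewness(means))
    print(n, round(results[n][0], 4), round(results[n][1], 3))
```

Real space weather series are not independent from sample to sample, which is why the approach to a Gaussian form in Figure 3 is slower than in this idealized case.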

2.1 The Evolution of the Distributions With Timescale for ap and Pα

Figure 4 looks in more detail at the evolution of the distributions of <Pα>τ/<Pα>1year (for α = 0.44) as a function of the logarithm of the averaging interval. The upper plot shows the pdf color coded as a function of log10(τ) and <Pα>τ/<Pα>1year such that the distributions shown in the left-hand plots of Figure 3 are vertical slices of Figure 4. The blue line in the lower panel shows the corresponding variation of the distribution variance v (also on a logarithmic scale). Figure 5 is the corresponding plot for <ap>τ/<ap>1year.

Figure 4.
(a) The variation of the observed distributions of the normalized power input into the magnetosphere <Pα>τ/<Pα>1year for α = 0.44 as a function of the logarithm of the averaging interval, log10(τ). The left-hand edge of the plot is at τ = 3 hr, the right-hand edge at τ = 1 year, and the vertical black lines show τ of 6 hr, 1 day, 7 days, 27 days, and 0.5 year. (b) The logarithm of the best fit variance of the lognormal distribution (of mean value m = 1), log10(v), also as a function of log10(τ). cdf = cumulative distribution function; pdf = probability density function.
Figure 5.
Same as Figure 4 for the normalized ap geomagnetic index, <ap>τ/<ap>1year. The distributions for τ < 9 hr are not shown as the quantization of 3-hourly ap levels becomes a factor.

In the supporting information, the distributions shown in Figure 3 are fitted with seven distribution forms, six of which are characterized by two parameters (either the mean, m, and variance, v, or a pair of parameters that are defined by m and v; note the seventh distribution form used, the Burr, has an additional shape parameter and is included to test if this gives a statistically significant improvement to the fit). Two of the distributions, the Gaussian and the Rician, do not give good fits at low τ but do quantify the evolution of the distributions toward a Gaussian-like form as τ is increased toward 1 year. Because we here look at the distributions of normalized disturbance metrics <X>τ/<X>1year (in this paper we consider X of Pα, ap, and Dst), the mean m is, by definition, always unity, and hence, we only need to study the behavior of the variance, v, shown in Figure 4b for <Pα>τ/<Pα>1year and in Figure 5b for <ap>τ/<ap>1year.

2.2 The Effect of Autocorrelation on the Evolution of Distributions

To help understand Figures 4 and 5, Figure 6 shows the evolution with increased τ for a synthesized variable XR that is selected at random at time resolution τ = 3 hr from a Weibull distribution with k of 1.0625 and λ of 1.0240 (giving a mean m = 1), which in Paper 2 (Lockwood et al., 2018b) was shown to be a good fit to the distribution of <Pα>τ/<Pα>1year at that timescale. The general pattern of evolution of the pdfs of <XR>τ/<XR>1year in Figure 6a is like that in Figures 4a and 5a, other than that the distributions evolve toward a delta function at unity with increasing τ rather more rapidly for XR. This is also reflected by the mauve line in Figure 6b, which shows that the variance, v, falls more rapidly than the blue and red lines in Figures 4b and 5b for Pα and ap, respectively. The initial distribution in Figure 6 is a Weibull form, but even at τ as low as 9 hr it has evolved into a lognormal form, which it keeps at all greater τ (but the variance falls so it approaches a Gaussian near τ = 1 year). This evolution of the distribution form is the same sequence that Pα follows.
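The statement that k = 1.0625 and λ = 1.0240 give a mean of unity can be checked from the standard Weibull moment formulae, mean = λΓ(1 + 1/k) and variance = λ²Γ(1 + 2/k) − mean². The short sketch below evaluates them; the function name `weibull_mean_variance` is ours.

```python
import math

def weibull_mean_variance(k, lam):
    """Mean and variance of a Weibull distribution with shape k and scale lam."""
    mean = lam * math.gamma(1.0 + 1.0 / k)
    variance = lam ** 2 * math.gamma(1.0 + 2.0 / k) - mean ** 2
    return mean, variance

mean_xr, var_xr = weibull_mean_variance(1.0625, 1.0240)
print(round(mean_xr, 4), round(var_xr, 4))
```

The variance returned is that of the 3-hourly starting distribution, which is the left-hand end point of the v(τ) curve for XR.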

Figure 6.
The same as Figure 4 for a random variable XR of the same length and time resolution as the Pα data series and which for τ = 3 hr is drawn from a Weibull distribution with k of 1.0625 and λ of 1.0240, which in Paper 2 (Lockwood et al., 2018b) was shown to be a good fit to the distribution of Pα at that timescale. pdf = probability density function.

The mauve line in Figure 7 shows the ACF (the autocorrelation at lags in steps of 3 hr, the resolution of the synthetic data) of the random variable XR employed in Figure 6. It can be seen that XR is indeed completely random as the ACF falls to zero at lag 1. To investigate the effect of autocorrelation, we generate a second random distribution that we then pass through a smoothing filter to give it autocorrelation. This generates a synthetic data series XRf. Because the filter has a similar effect on the distribution as averaging, we have to draw the original random distribution from a higher-variance Weibull. By iteration we find that for the filter we use, an initial Weibull random distribution with k of 0.2800 and λ of 0.0778 (giving m = 1) generates an almost identical distribution at τ = 3 hr after filtering to that of XR used in Figure 6. The filter used is a triangular-weighting moving-average filter with two response peaks. The first is a [1-3-5-3-1] peak around lag δt = 0 that adds short-range correlation into the XRf data series. The second is a [1-2-3-4-5-6-7-8-7-6-5-4-3-2-1] × (5/8) triangular response peak centered on lag 216 (for the 3-hr resolution XRf data series, this second peak is at a lag of 27 days). The black line in Figure 7 shows the ACF of XRf, and it can be seen that the filter has introduced short-term autocorrelation on lags up to about 1 day and a 27-day (the mean solar rotation period seen from Earth) recurrence.
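The two-peak filter can be written down directly from the weights quoted above. The sketch below is one possible reading, with several assumptions that the text does not specify: the weights are normalized by their sum so that the mean is preserved, the second peak acts on samples 216 steps earlier, and edge samples without a full window are dropped. The function names are hypothetical. For white-noise input, the ACF of the output at a given lag is the normalized overlap of the weight pattern with a shifted copy of itself, which `kernel_acf` evaluates exactly.

```python
import random
from statistics import fmean

def build_kernel():
    """Filter weights keyed by lag offset in 3-hr steps (two triangular peaks)."""
    kernel = dict(zip(range(-2, 3), [1.0, 3.0, 5.0, 3.0, 1.0]))
    ramp = list(range(1, 9)) + list(range(7, 0, -1))      # 1..8..1 (15 taps)
    for i, w in enumerate(ramp):
        kernel[216 - 7 + i] = w * 5.0 / 8.0                # centered on lag 216 (27 days)
    return kernel

def kernel_acf(kernel, lag):
    """ACF at `lag` of white noise after passing through the kernel."""
    num = sum(w * kernel.get(off + lag, 0.0) for off, w in kernel.items())
    den = sum(w * w for w in kernel.values())
    return num / den

def apply_filter(x, kernel):
    """Weighted moving average; only samples with a full window are returned."""
    total = sum(kernel.values())
    lo, hi = min(kernel), max(kernel)
    return [sum(w * x[t - off] for off, w in kernel.items()) / total
            for t in range(hi, len(x) + lo)]

def sample_acf(x, lag):
    m = fmean(x)
    den = sum((v - m) ** 2 for v in x)
    return sum((x[i] - m) * (x[i + lag] - m) for i in range(len(x) - lag)) / den

kernel = build_kernel()
rng = random.Random(3)
raw = [rng.weibullvariate(0.0778, 0.2800) for _ in range(20_000)]   # (scale, shape)
filtered = apply_filter(raw, kernel)
for lag in (1, 8, 100, 216):
    print(lag, round(kernel_acf(kernel, lag), 3), round(sample_acf(filtered, lag), 3))
```

Under these assumptions the weight pattern alone implies strong correlation at lag 1, zero correlation at intermediate lags such as 100, and a recurrence peak at lag 216. The sample ACF of a finite, heavy-tailed series will scatter around those values.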

Figure 7.
The autocorrelation functions (ACFs) of (in mauve) the random variable XR employed in Figure 6 and (in black) the filtered random variable XRf employed in Figure 8. The ACF, a(Δt), is computed for lags Δt between zero and 1 year in steps of the data resolution (δt = 3 hr) and is shown as a function of log10(Δt + δt), where Δt and δt are both in units of days (the δt is added to Δt to allow the zero lag point to be shown on a logarithmic scale). The left-hand edge of the plot is at Δt = 0 and the right-hand edge at Δt = 1 year, and the vertical gray lines are at lags Δt of 1 day, 7 days, 27 days, and 0.5 year. Lag 1 (Δt = δt) is at −0.602 on the x axis.

Figure 8 shows the equivalent plot to Figure 6 for the XRf data series. Figure 8a shows that the effect of the autocorrelation is to slow the progression toward the delta function at unity. This is expected from the CLT because the autocorrelation means that larger averaging timescales are needed before samples are sufficiently uncorrelated for the CLT to apply. Figure 8b shows the variation of the variance v for XRf in black and compares it with that for XR (in mauve) from Figure 6b. It can be seen that at the τ where autocorrelation has been introduced into the XRf series by the filter, the variance falls less quickly than for the random series, XR. At all τ the distribution of XRf is lognormal in form and mirrors the evolution for ap. Note that Figures 7 and 8, and the results for a random and a smoothed-random data series (XR and XRf), are included here only to illustrate how autocorrelation influences the form of the evolution of the distribution with τ and also influences the dependence of variance v on τ. They are not used again in the derivation of a model of the distribution at a given τ. Instead, we fit the v(τ) variation derived directly from data with a polynomial in τ.
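The slowing effect of autocorrelation on the fall of the variance follows from a standard result for a stationary series with ACF ρ(k): the variance of the mean of n consecutive samples is (σ²/n)[1 + 2Σ(1 − k/n)ρ(k)], summed over k = 1 to n − 1. The sketch below evaluates this ratio for an uncorrelated series and for a series with an exponentially decaying ACF. The exponential form and its 1-day e-folding time are hypothetical choices for illustration, not the ACF of XRf, and the formula ignores the normalization by the annual mean that forces v to zero at τ = 1 year.

```python
import math

def variance_ratio(n, acf):
    """Var(mean of n consecutive samples) / Var(single sample), stationary series."""
    s = sum((1.0 - k / n) * acf(k) for k in range(1, n))
    return (1.0 + 2.0 * s) / n

def white(k):
    return 0.0                      # no correlation at any nonzero lag

def persistent(k):
    return math.exp(-k / 8.0)       # hypothetical: 1-day e-folding at 3-hr resolution

for n in (1, 8, 56, 216):           # 3 hr, 1 day, 7 days, 27 days
    print(n, round(variance_ratio(n, white), 4), round(variance_ratio(n, persistent), 4))
```

For the uncorrelated case the ratio is exactly 1/n, whereas persistence keeps it well above 1/n until n is large compared with the correlation time, in line with the behavior of the black and mauve lines in Figure 8b.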

Figure 8.
The same as Figure 6 for a random variable XRf that has been drawn from a Weibull distribution and then passed through a filter to generate the short-term persistence and the 27-day recurrence shown by the autocorrelation function in black in Figure 7 (see text for details of the filter). In order that the distributions of XRf and XR have the same variance at τ = 3 hr (with unity mean), the effect of the filter means that before filtering the distribution must be drawn from a higher-variance Weibull distribution (with unity mean) than XR with k of 0.2800 and λ of 0.0778. The black line in Figure 8b shows the evolution of the variance, v, (on a logarithmic scale) with τ for XRf, and the mauve line is the same variation for XR, as shown in Figure 6b. pdf = probability density function.

2.3 Modeling the Evolution of Distribution of ap With Increasing Timescale

This section describes how we model the evolution of the distributions of <ap>τ/<ap>1year with increasing τ, and Figure 9a presents the results of that modeling, aimed at reproducing Figure 5a. Figure 9b shows the log-log plots of variance v as a function of τ from Figures 4b, 5b, and 6b using the same color scheme, that is, for Pα in blue, for ap in red, and for the random variable, XR, in mauve. Also shown, in cyan, is the variation for the 150-year data series of the aa geomagnetic index. The black line is a polynomial fit to the ap variation, given by equations A11 and A12 of Appendix A that yield the variance, v(τ). The maximum likelihood analysis given in the supporting information (on which Figure 3 is based) shows that for <ap>τ/<ap>1year the observed distribution at all τ is best fitted with a lognormal form with mean m = 1. (That is until τ approaches 1 year, when the distribution becomes nearly Gaussian in form and the goodness-of-fit metrics for all seven distributions become very similar.) Figure 9a shows the modeled lognormal distributions using the polynomial fit to the variance variation shown in Figure 9b. The equations for reproducing the distribution for a given τ are given in section A1. From this, the pdf of <ap>τ (and hence that of the time integral of the activity, τ × <ap>τ) at a given τ can be computed for a known annual mean <ap>1year.
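The final step, going from a variance v(τ) to the pdf of <ap>τ for a known annual mean, can be sketched using the standard relations between the mean and variance of a lognormal and its log-space parameters: σ² = ln(1 + v/m²) and μ = ln(m) − σ²/2. The polynomial for v(τ) is in Appendix A and is not reproduced here, so the value of v used below is a hypothetical placeholder, as are the function names and the annual mean of 15.

```python
import math

def lognormal_params(mean, variance):
    """Log-space (mu, sigma) of a lognormal with the given mean and variance."""
    sigma2 = math.log1p(variance / mean ** 2)
    return math.log(mean) - 0.5 * sigma2, math.sqrt(sigma2)

def lognormal_pdf(x, mu, sigma):
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def pdf_of_ap(ap_value, annual_mean_ap, v):
    """pdf of <ap>_tau given the annual mean, from the unit-mean normalized pdf."""
    mu, sigma = lognormal_params(1.0, v)
    # Change of variable: X = annual_mean * Y, so p_X(x) = p_Y(x / annual_mean) / annual_mean.
    return lognormal_pdf(ap_value / annual_mean_ap, mu, sigma) / annual_mean_ap

V_TAU = 0.5                                    # hypothetical v(tau); see Appendix A for the fit
mu1, sigma1 = lognormal_params(1.0, V_TAU)
for ap_value in (5.0, 15.0, 30.0, 60.0):
    print(ap_value, round(pdf_of_ap(ap_value, 15.0, V_TAU), 5))
```

Dividing by the annual mean in `pdf_of_ap` keeps the pdf normalized after rescaling from the dimensionless ratio to ap units.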

Figure 9.
Same as Figure 5 for a model X based on lognormal distributions and a sixth-order polynomial fit to the variance of ap, v(τ). In Figure 9b the red line shows v(τ) for ap (on a logarithmic scale), and the black line is the polynomial fit (see Appendix A for the polynomial coefficients and formulae for the lognormal distribution family). Also shown are the v(τ) variations for other variables using the same color scheme as used in Figures 4b, 5b, and 6b: Pα (in blue); random variable, XR (in mauve), plus the aa geomagnetic index (in cyan). pdf = probability density function.

The cyan line in Figure 9b is for the full aa index data set that covers the interval 1868–2017. The close similarity of the v(τ) relationship to that for the ap data (1932–2017, the red line) strongly indicates that this relationship has not varied significantly over the past 150 years. To check this in more detail, the aa data have been divided into three 50-year intervals (1868–1917, 1918–1967, and 1968–2017, inclusive), and the v(τ) relationships for these three data subsets are plotted in Figure 10b as green, blue, and red lines, respectively, and can be seen to be very similar to each other (and to that for the overall aa data in Figure 9b). Figure 10a studies the ACF of the aa/<aa>τ=1year data for these three intervals. The three are again very similar, showing the persistence effect at low τ (up to about 5 days), a recurrence peak at 27 days, plus some weak harmonics of the 27-day variation, and hence are very similar to that for the smoothed random variable, XRf, in Figure 7. In fact, the ACF for XRf could easily be made to match the observed ACFs for aa shown in Figure 10 very closely, if the smoothing filter used were adjusted to give slightly lower persistence at low τ (< 1 day) and the response peak around 27 days were to be broadened somewhat. There is also a small but marked and persistent diurnal signal visible in Figure 10a. The main difference between the three intervals is that the 27-day peak is a little larger for the earliest interval (1868–1917) and the low-τ persistence a little weaker. These differences cannot be identified in the v(τ) plots. The only other data that are continuous and of high enough time resolution to potentially investigate this further back in time are the daily values of the international sunspot number R, which are almost continuous since 1818. 
However, sunspot numbers behave very differently to geomagnetic activity indices, showing sudden increases/decreases as spot groups rotate on/off the visible disk of the Sun and rises and falls as the groups wax or wane as they rotate across the visible solar disk: they do not have the bursty nature of Earth-directed interplanetary disturbances and hence of geomagnetic disturbances. Hence, they cannot help us investigate the ACF and the associated v(τ) relationship for near-Earth space and geomagnetic activity before the start of regular, well-calibrated geomagnetic observations.

Figure 10.
(top) The autocorrelation function of the 3-hourly aa index, divided into three 50-year intervals: (red) 1968–2017 (inclusive); (blue) 1918–1967; and (green) 1868–1917. The lower panel shows the relationship of the variance v of the lognormal distribution of <aa>τ/<aa>τ=1year as a function of the averaging timescale (on the log-log plot format used in panel b of Figures 4-9).

Figure 11 investigates whether the ACFs and variances for aa shown in Figure 10 vary with sunspot number. We use the international sunspot number, R, derived and distributed by WDC-SILSO (World Data Center for Sunspot Index and Long-term Solar Observations), Brussels. We take 3-year averages of the data to keep sample numbers high. For each period we evaluate the mean sunspot number, <R>τ=3year, and the ACF of aa/<aa>τ=1year. These ACFs were then averaged together for contiguous bins of <R>τ=3year that are centered on values between 10 and 200 in steps of 20. In addition, the variance v of the distribution of all <aa>τ/<aa>τ=1year samples in each band of <R>τ=3year was computed for each averaging timescale τ. The top panel of Figure 11 shows a surface plot of the ACF as a function of log10(τ) and <R>τ=3year. On timescales below about τ = 25 days the ACFs hardly vary at all with the sunspot number. The major effect is on the peak at 27 days (and its harmonics), which has a larger amplitude when the sunspot number is low. The lower panel gives the corresponding surface plot of log10(v): note that sample numbers do not allow this analysis to extend to as great a sunspot number as for the ACF analysis. As would be expected from the ACFs, there is almost no variation in the v-τ relationship with sunspot number at τ below about 25 days, but above this the larger ACF peak at 27 days for low sunspot number causes v to fall with τ slightly less rapidly than it does at higher sunspot numbers. There are some slight but persistent ridges and dips in the surface shown in Figure 11b at certain <R>, but the surface is remarkably independent of R. Note that the lack of any dependence of the v-τ relationship on sunspot number (at low τ) was also revealed by Figure 8c of Lockwood, Owens, et al. (2018), which plots distributions of <aa>τ=1day/<aa>τ=1year as a function of year and in which no solar cycle variation can be detected.

Figure 11.
Surface plots of (top) the autocorrelation function, ACF, and (bottom) the logarithm of the variance, log10(v), for all the aa index data (1868–2017) as a function of the logarithm of the averaging timescale, log10(τ), and the mean international sunspot number, averaged over a 3-year interval, <R>τ=3years.

It is tempting to argue that we should modify the model form of the v-τ relationship at τ > 25 days to allow for the (weak) sunspot number variation seen at large τ in the lower panel of Figure 11. The major reason is that during the Maunder minimum the persistently low sunspot number might make this a factor. However, this is not necessarily the case because a prolonged (grand) sunspot activity minimum is in many ways quite different to a sunspot activity minimum between solar cycles: one major reason is that at the cycle minima there is residual open flux, generated during the previous cycle, out of which fast solar wind flows. The 27-day ACF peak is largely caused by CIRs (corotating interaction regions), which form when fast solar wind emanating from coronal holes that reach down to low latitudes catches up with Earth-bound slow solar wind of the streamer belt. Modeling for the Maunder minimum predicts that the streamer belt will have been considerably wider than in modern times, with coronal holes restricted to high heliographic latitudes (Lockwood & Owens, 2014a, 2014b; Owens et al., 2017), making CIRs that hit Earth less, rather than more, common. Hence, it is not at all clear that the effect noted in low sunspot years at τ > 25 days in Figure 11 will also apply to the Maunder minimum. For the present paper we assume that the v(τ) relationship does not change, and we fit it with a single polynomial form. However, should long-term changes in the v(τ) relationship be discovered at some point in the future, they could be readily accommodated by making the fit polynomial coefficients a function of time.

Figure 12 shows that the modeled distributions shown in Figure 9a can explain the variation of the occurrence of large events as a function of the annual means, as discussed in Paper 2. The points in Figure 12a show the probability that 3-hourly values of ap are in the top 5% of the overall distribution (for 1932–2016, 252152 samples), f[ap>apo] (i.e., ap exceeds the 95th percentile of all 3-hourly ap values, apo = 47.91), as a function of the annual mean value <ap>τ=1year. The mauve line is the prediction for τ = 3 hr from the model values displayed in Figure 9a. The fit can be seen to be close. The family of model predictions of f[ap>apo] as a function of <ap>τ=1year is shown in Figure 12b for timescales of 1 day (in blue), 7 days (in orange), and 27 days (in black). Hence, the model is reproducing the behavior noted in Figure 1 of Paper 2, namely that, with some scatter, the number of events in any 1 year that are in the top 5% of the overall distribution increases hyperbolically with the mean value for that year.

Predictions by the model fit to the ap distributions with τ shown in Figure 9. (a) The points show probability that 3-hr values of ap are in the top 5% of the overall distribution (for 1932–2016, 252152 samples), f [ap>apo] (i.e., ap exceeds its 95th percentile of all 3-hourly ap values, apo = 47.91), as a function of the annual mean value <ap>τ=1year. The mauve line is the model prediction for τ = 3 hr. (b) The family of model predictions of f [ap > apo] as a function of <ap>τ=1year for timescales τ of 3 hr (in mauve), 1 day (in blue), 7 days (in orange), and 27 days (in black).
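The model curves in Figure 12 amount to evaluating the survival function of the normalized lognormal distribution at the threshold apo/<ap>τ=1year. The following Python sketch illustrates that step only. It assumes the standard lognormal relations for a distribution with mean 1 and variance v; the function name and the variance value `V_ASSUMED` are hypothetical placeholders, not the fitted v(τ) of the model.

```python
import math

AP_O = 47.91  # 95th percentile of all 3-hourly ap values, as quoted in the text


def exceedance_fraction(annual_mean, v, threshold=AP_O):
    """P(<ap>_tau > threshold) when <ap>_tau/<ap>_1yr is lognormal with mean 1, variance v."""
    sigma2 = math.log(1.0 + v)  # variance of ln(x); with mean m = 1, v/m^2 = v
    mu = -0.5 * sigma2          # mean of ln(x) that gives a mean of exactly 1
    x = threshold / annual_mean  # threshold expressed in normalized units
    z = (math.log(x) - mu) / math.sqrt(2.0 * sigma2)
    return 0.5 * math.erfc(z)   # lognormal survival function


V_ASSUMED = 1.5  # placeholder variance for illustration only

for annual_mean in (5.0, 10.0, 15.0, 20.0, 25.0):
    print(annual_mean, exceedance_fraction(annual_mean, V_ASSUMED))
```

Because the shape depends only on τ, the annual mean enters solely through the normalized threshold, which is why one variance per timescale generates a whole curve.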

2.4 The Evolution of the Distributions With Timescale for Dst

Figure 13 is the equivalent plot to Figure 4 for the Dst′ data that extend from 1957 to 2016. Here the pdf is shown as a function of τ and <Dst′>τ/<Dst>1year. Generating a model fit to this plot is more complex because Dst does not converge to zero for low activity and we have to use Dst′ instead, where Dst′ is the same as Dst, but all positive values are treated as data gaps. In annual mean data, this makes no difference, because all annual means are negative, but with decreasing τ the number of Dst′ samples falls compared to the number of Dst samples, and the mean Rm of the distribution of <Dst′>τ/<Dst>1year, although unity at τ = 1 year, is greater than unity at lower τ because negative values of <Dst′>τ/<Dst>1year (i.e., positive values of <Dst′>τ) are neglected. Figure 14a shows in red the variation with log10(τ) of fneg (= NDst′/NDst), the fraction of Dst samples that are negative (the subset termed Dst′). The black line is a polynomial fit to this variation, which is given by equation A22 of Appendix A. The green line shows the corresponding variation of Rm, the mean of <Dst′>τ/<Dst>1year. Again the black line is the best polynomial fit, given by equation A23 of Appendix A. Section A4 gives the algorithm for computing the pdf of Dst′ for a given annual mean Dst and timescale τ, allowing for these two factors. Figure 15 corresponds to Figure 9 for the Dst index. As shown by Figure 3, the distributions of <Dst′>τ/<Dst>1year follow the Weibull family of distributions, and these are derived from the best fit to the observed log10(v)-log10(τ) variation (shown in green in Figures 13b and 15b), using the polynomial fit shown in black that is given by equations A20 and A21 of Appendix A. For comparison, Figure 15b also shows the log10(v)-log10(τ) variations for Pα (in blue), ap (in red), and the random variable, XR (in mauve).

Same as Figure 4 for the normalized Dst geomagnetic index, <Dst′>τ/<Dst>1year where Dst′ is the subset of Dst values that are negative. pdf = probability density function.
The variation with averaging interval τ of (top) the fraction of Dst samples that are negative (the subset termed Dst′) and (bottom) the mean of the ratio of the mean value of Dst′ in intervals of duration τ to the annual mean values of Dst. (Top) fneg = NDst′/NDst is shown as a function of log10(τ), where NDst′ is the number of samples at that τ for which Dst ≤ 0 and NDst is the number of Dst samples of either sign. The red line is the mean for all Dst samples (from 1957 to 2016); the black line is the best polynomial fit (see Appendix A for details). (Bottom) Rm = <Dst′>τ/<Dst>1year is shown as a function of log10(τ). The green line shows the result for all the data (from 1957 to 2016); the black line is the best polynomial fit (see Appendix A for details).
Same as Figure 9 for a model X based on Weibull distributions and a sixth-order polynomial fit to the variance of Dst′, v(τ). Note that by only considering the negative Dst values (Dst′) the mean values of the fitted distributions are Rm(τ) rather than unity, and the pdfs have also been multiplied by fneg to allow for the existence of positive values. In both cases, the values used here are from the polynomial fits shown in Figure 14. In Figure 15b the green line shows v(τ) for Dst′ (on a logarithmic scale), and the black line is the polynomial fit (see Appendix A for the polynomial coefficients and formulae for the Weibull distribution family). Also shown are the v(τ) variations for other variables using the same color scheme as used in Figures 4b, 5b, and 6b: Pα (in blue); ap (in red); random variable, XR (in mauve). pdf = probability density function.
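The two quantities plotted in Figure 14 can be illustrated with a short Python sketch. It is a simplified reading of the definitions in the text, applied to one year of made-up τ-averaged Dst values: positive averages are treated as data gaps, fneg is the fraction of averages retained, and Rm is the mean of the retained averages divided by the annual mean of all Dst. The function name `dst_prime_stats` and the numbers are hypothetical.

```python
import statistics


def dst_prime_stats(window_means, annual_mean):
    """Return (f_neg, R_m) for tau-averaged Dst values (nT) within one year."""
    # Dst': keep only averages with Dst <= 0; positive averages count as data gaps.
    kept = [d for d in window_means if d <= 0.0]
    f_neg = len(kept) / len(window_means)
    # Mean of <Dst'>_tau / <Dst>_1yr over the retained averages.
    r_m = statistics.fmean([d / annual_mean for d in kept])
    return f_neg, r_m


# Made-up tau-averages for one year; the annual mean uses all values, of either sign.
window_means = [-30.0, -10.0, 5.0, -25.0, 10.0, -10.0]
annual_mean = statistics.fmean(window_means)  # -10 nT

f_neg, r_m = dst_prime_stats(window_means, annual_mean)
print(f_neg, r_m)
```

Discarding the positive averages removes negative ratios from the sample, so Rm exceeds unity whenever fneg is below 1, as described above.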

Figure 16 corresponds to Figure 12 and shows how the model can reproduce the occurrence of Dst below its overall 5th percentile value (Dsto = −55.142 nT), that is, in the top 5% of disturbance levels, as a function of the annual mean value. Figure 16b shows the family of such variations for different values of τ.

Same as Figure 12 for predictions by the model fit to the Dst distributions with τ shown in Figure 15. (a) The points show the observed probability that 1-hr values of Dst are in the top 5% of the overall distribution of Dst disturbance levels (for 1957–2016, 525960 samples), f[Dst<Dsto] (i.e., Dst is less than the 5th percentile of its 1-hourly values, Dsto = −55.14 nT), as a function of the annual mean value of Dst, <Dst>τ=1year. The mauve line is the model prediction for τ = 1 hr. (b) The family of model predictions of f[Dst<Dsto] as a function of <Dst>τ=1year for timescales τ of 1 hr (in mauve), 1 day (in blue), 7 days (in orange), and 27 days (in black).

3 Discussion and Conclusions

It is noticeable that the log10(v)-log10(τ) variation for ap (in red in Figure 9b) flattens off as averaging timescale τ falls below about 1 day, whereas the variance v continues to rise with decreasing τ for power input into the magnetosphere, Pα (in blue). Using a synthesized random time series and a filter, we have demonstrated how the flattening off is caused by autocorrelation in the time series. Hence, there is autocorrelation in the ap time series at τ between 3 hr and 1 day that is greater than that in Pα. As Pα is the driver of ap, this means that the geomagnetic response seen in ap is a smoothed response. This is not surprising, given the currents that the index is sensitive to and their associated time constants. The ap index is primarily influenced by the substorm current wedge (Lockwood, 2013) that is initiated only after a substorm growth phase lasting typically 30–40 min. Hence, the rapid variations in the energy input into the magnetosphere, which are mainly associated with IMF orientation changes, are smoothed as energy (and open magnetic flux) is accumulated in the tail.

The same effect is even clearer for Dst, for which v flattens off as τ falls below about 3 days (the green line in Figure 15b which is again compared to the behavior for Pα in blue). Hence, the smoothing effect on the response of Dst has a longer time constant than that for ap. The (negative) Dst index is responding primarily to the ring current (Turner et al., 2000) that shows greater time constants, responding to the integral of solar wind forcing on timescales of order of a day or more (Borovsky, 2017; Lockwood et al., 2016; note that below we discuss the implications of the fact that even large negative Dst can be influenced by other factors, in particular, the magnetopause currents). This is not to say that Pα is the best coupling function explaining the solar wind influence on the ring current, not least because the coupling exponent α has been tuned to 0.44 to make Pα reproduce ap, not Dst. Nevertheless, the importance of southward IMF in driving disturbed Dst means that the same conclusions would be valid for any other coupling function that might better predict Dst.

Breaking down the power input into the magnetosphere Pα into its component factors, Paper 2 showed that the factors dependent on solar wind velocity and mass flux and on the IMF (FV, FN, and FB) do not vary much on short timescales, and that the distribution of power input into the magnetosphere is set by the variation in the IMF orientation factor Fθ which, although it can stay stable for several days, typically changes on minute timescales. Thus, the shape of the distribution is set by Fθ at very short timescales, much shorter than the timescale of the geomagnetic index response; it then evolves with τ according to the CLT, making the shape of the distribution a function of τ only.

A climatology is a statistical description that would enable us to evaluate the probability of space weather events of a given magnitude, and we are working toward one that applies to the full range of solar conditions from grand solar minimum to grand solar maximum. In particular, there is value in knowing the integrated level of activity over an extended period τ, which equals the average value times the duration. Hence, we investigate algorithms that can give us the probability of a given average value for a given τ. These algorithms will be of great value in generating a long-term climatology because they can compute the probabilities for a given annual mean, and we have annual means for the past 400 years from recent modeling work based on telescopic sunspot observations (Owens et al., 2017). The approach outlined in this paper is based on the finding that the shape of the distribution of the normalized values (normalized by dividing by the annual mean value) depends only on the averaging timescale τ. This was used by Lockwood, Owens, et al. (2018) to look at the occurrence of large events (defined as in the top 5% since records began) over 400 years. The constancy of the shape of the distributions was just taken by Lockwood, Owens, et al. (2018) as an empirical observation that could be exploited. The present series of three papers provides greater understanding of why this empirical result applies and why the distributions have the form that they do. This is important because it means the result can be applied with greater confidence to periods when inferences are made only from proxy data and, in particular, to grand minima like the Maunder minimum.

We have developed methods that enable computation of the core distribution of both the ap and (negative) Dst geomagnetic indices for a given annual mean value at a required averaging timescale τ. The algorithms for doing this are detailed in sections A3 and A4, respectively. The complications caused by the fact that the Dst index, unlike ap, does not tend to zero when activity is quiet have led to the algorithm for Dst being somewhat more involved than that for ap, and the distributions are best fitted with a Weibull family of distributions, as opposed to the lognormal family for ap.

The model distributions for the ap index make use of the lognormal form which, as shown in the supporting information, gives the best MLE fit of all the distribution forms with two free parameters. The Burr distribution gives slightly better fits according to the absolute goodness-of-fit metrics (least squares and modified Kolmogorov-Smirnov), but the relative metrics that allow for the degrees of freedom (Akaike information criterion, AIC and Bayesian information criterion, BIC) show that the extra degree of freedom is not justified. (Note that as τ approaches 1 year and the observed distribution tends toward a Gaussian, all the distributions are good fits and differences are minimal). Thus, there is no question that the ap model employs the best form of distribution (i.e., the lognormal). The model is also relatively straightforward because the ap index is unipolar and tends to zero at the quietest activity levels. The largest uncertainty in using the model in even the Maunder minimum relates to the occurrence of CIRs and recurrent disturbances that may influence the model at averaging timescales τ greater than about 25 days.

For the Dst model these considerations are less straightforward. First, the Weibull, Gamma, and log-logistic distributions all perform similarly, and none of them is an ideal fit to the observed distribution. Furthermore, the extra degree of freedom of the Burr distribution gives fits that are better by a statistically significant degree. This means the added complexity of using two shape parameters (in addition to the mean m = 1) would be worthwhile. However, at this point it is worth remembering that the Dst index is intrinsically an imperfect metric, and hence, the additional fit accuracy is unlikely to justify the additional complexity. Hence, we propose, in a later paper, to generate a model for the pressure-corrected index Dst*. Because Dst* can, like Dst, have both positive and negative values, an approach similar to that adopted here for Dst will be needed.

Acknowledgments

The authors are grateful to the staff of Space Physics Data Facility, NASA/Goddard Space Flight Centre, who prepared and made available the OMNI2 data set used. The data were downloaded from http://omniweb.gsfc.nasa.gov/ow.html. They are also grateful to the staff of GeoForschungsZentrum (GFZ) Potsdam, Adolf-Schmidt-Observatorium für Geomagnetismus, Niemegk, Germany who generate the ap data. The ap and aa data were downloaded from the UK Space Science Data Centre from https://www.ukssdc.ac.uk/ with updating of recent data from BGS Edinburgh http://www.geomag.bgs.ac.uk/data_service/data/magnetic_indices/apindex.html. The international sunspot data were compiled and made available by WDC-SILSO, Royal Observatory of Belgium, Brussels http://www.sidc.be/silso/. The work presented in this paper is supported by STFC consolidated grant ST/M000885/1, the work of ML and MJO is also supported by the SWIGS NERC Directed Highlight Topic grant NE/P016928/1/ and of OA by NERC grant NE/P017274/1. S.B. is supported by NERC as part of the SCENARIO Doctoral Training Partnership NE/L002566/1.

    Appendix A: Probability Distributions of ap and Dst

    In the paper, we make use of two distribution forms, the lognormal and the Weibull.

    A1. The Equations of the Lognormal Distribution

    For the lognormal distribution the two parameters that are usually used to specify the distribution are μ and σ. These are, respectively, the mean and standard deviation of the normal distribution in logn(x) where x is the variable that is lognormally distributed. These are related to the mean m and variance v of x by
    $$m = \exp\left(\mu + \frac{\sigma^{2}}{2}\right) \tag{A1}$$
    $$v = \left[\exp\left(\sigma^{2}\right) - 1\right]\exp\left(2\mu + \sigma^{2}\right) \tag{A2}$$
    or conversely expressing μ and σ in terms of m and v we have
    $$\mu = \ln\left(\frac{m}{\sqrt{1 + v/m^{2}}}\right) \tag{A3}$$
    $$\sigma = \sqrt{\ln\left(1 + \frac{v}{m^{2}}\right)} \tag{A4}$$
    Hence, specifying a lognormal distribution using μ and σ is precisely the same as specifying it using m and v. The advantage of using μ and σ is that the equation for the probability distribution of a lognormal is simpler:
    $$f(x) = \frac{1}{x\,\sigma\sqrt{2\pi}}\exp\left[-\frac{\left(\ln x - \mu\right)^{2}}{2\sigma^{2}}\right] \tag{A5}$$

    For any one combination of m and v, we compute μ and σ using equations A3 and A4 and hence determine the full distribution using A5.
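The conversion from (m, v) to (μ, σ) and the evaluation of the density can be sketched in Python as follows. This is a minimal illustration that uses the standard lognormal relations, which are assumed here to be what equations A3 to A5 express; the function names and the example variance are hypothetical.

```python
import math


def lognormal_params(m, v):
    """Return (mu, sigma) of ln(x) for a lognormal with mean m and variance v."""
    if m <= 0 or v <= 0:
        raise ValueError("m and v must be positive")
    sigma2 = math.log(1.0 + v / m**2)  # variance of ln(x)
    mu = math.log(m) - 0.5 * sigma2    # mean of ln(x)
    return mu, math.sqrt(sigma2)


def lognormal_pdf(x, mu, sigma):
    """Lognormal probability density at x (zero for x <= 0)."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


# Illustrative values only: m = 1 (a normalized variable) and a made-up variance.
mu, sigma = lognormal_params(1.0, 0.5)
print(mu, sigma, lognormal_pdf(1.0, mu, sigma))
```

Substituting the returned μ and σ back into the forward relations for m and v recovers the inputs, which is a convenient check on any implementation.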

    A2. The Equations for a Weibull Distribution

    For the Weibull distribution (also called the Rosin-Rammler distribution), the two parameters used to describe the distribution are a scale parameter λ and a shape parameter k. (Note that both λ and k are always positive.)

    The mean and variance of the distribution in x are again m and v, where
    $$m = \lambda\,\Gamma\left(1 + \frac{1}{k}\right) \tag{A6}$$
    $$v = \lambda^{2}\left\{\Gamma\left(1 + \frac{2}{k}\right) - \left[\Gamma\left(1 + \frac{1}{k}\right)\right]^{2}\right\} \tag{A7}$$
    where Γ is a gamma function. The converse equations for λ and k cannot be derived analytically and we solve them iteratively by varying the shape parameter k until
    $$\frac{\Gamma\left(1 + 2/k\right)}{\left[\Gamma\left(1 + 1/k\right)\right]^{2}} - 1 = \frac{v}{m^{2}} \tag{A8}$$
    and
    $$\lambda = \frac{m}{\Gamma\left(1 + 1/k\right)} \tag{A9}$$
    and then checking, over the full range of allowed k for a given v and m, that the solution is unique.
    The Weibull distribution is
    $$f(x) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}\exp\left[-\left(\frac{x}{\lambda}\right)^{k}\right], \quad x \ge 0 \tag{A10}$$
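The iterative solution for k and λ can be sketched in Python as follows. This is one possible implementation, not necessarily the procedure used for the paper: it assumes the standard Weibull mean and variance expressions, uses bisection on k to match v/m² (a ratio that does not depend on λ), and then obtains λ from the mean. The function names, bracket limits, and tolerance are hypothetical choices.

```python
import math


def weibull_cv2(k):
    """Squared coefficient of variation, v/m^2, of a Weibull with shape k."""
    g1 = math.gamma(1.0 + 1.0 / k)
    g2 = math.gamma(1.0 + 2.0 / k)
    return g2 / (g1 * g1) - 1.0


def weibull_params(m, v, k_lo=0.1, k_hi=50.0, tol=1e-12):
    """Return (k, lam) for a Weibull with mean m and variance v, by bisection on k."""
    target = v / (m * m)
    if not (weibull_cv2(k_hi) <= target <= weibull_cv2(k_lo)):
        raise ValueError("v/m^2 lies outside the bracketed range of k")
    # v/m^2 decreases monotonically as k increases, so the bracket holds one root.
    while k_hi - k_lo > tol * k_hi:
        k_mid = 0.5 * (k_lo + k_hi)
        if weibull_cv2(k_mid) > target:
            k_lo = k_mid  # spread too large: the shape parameter must be larger
        else:
            k_hi = k_mid
    k = 0.5 * (k_lo + k_hi)
    lam = m / math.gamma(1.0 + 1.0 / k)  # scale parameter from the mean
    return k, lam


def weibull_pdf(x, k, lam):
    """Weibull probability density at x (returned as zero for x <= 0)."""
    if x <= 0:
        return 0.0
    r = x / lam
    return (k / lam) * r ** (k - 1.0) * math.exp(-(r ** k))


# Illustrative values only: m = 1 and v = 1 give the exponential case k = 1.
print(weibull_params(1.0, 1.0))
```

Because v/m² is monotonic in k, the bisection bracket also gives a direct way to confirm that the solution is unique within the range searched.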

    Hence, as for the lognormal, the distribution is described by two parameters (μ and σ for a lognormal and k and λ for a Weibull) and in both cases specifying that pair is fully equivalent to specifying the mean and the variance. Note that in the paper we fit variables of the form X/<X> and so the mean value is m = 1 and the one fit variable is the variance v. The remainder of this Appendix gives the models used to generate the probability distribution functions, as a function of averaging timescale, τ, for the ap and Dst geomagnetic indices, shown in Figures 9 and 15, respectively.

    A3. Model for ap

    The polynomial fit to the variation of the logarithm of the variance, v, with timescale τ for the ap index, shown by the black line in Figure 9b, gives
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0011(A11)
    such that the model variance is
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0012(A12)
    By normalizing the ap values by the annual mean <ap>τ/<ap>τ=1 year, the annual distributions have a mean m = 1 at all τ.
    For ap the best fit is with the family of lognormal distributions.
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0013(A13)
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0014(A14)
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0015(A15)
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0016(A16)
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0017(A17)
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0018(A18)
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0019(A19)
    Equations A11–A19 allow the computation of the pdf f for a value of ap for an averaging timescale τ, <ap>τ, if we know its annual mean, <ap>τ=1year.

    Comparison of Figures 5a and 9a of the main text demonstrates the fit of the family of distributions to the ap data.

    A4. Model for Dst

    The polynomial fit to the variation of the logarithm of the variance, v, with timescale τ for the Dst index, shown by the black line in Figure 15b, gives
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0020(A20)
    such that the model variance is
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0021(A21)
    The fraction of Dst′ samples (with Dst ≤ 0), as a function of timescale τ, is given by the polynomial (the black line in Figure 14a)
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0022(A22)

    (Note that such a high-order polynomial is needed to capture the observed variation with sufficient accuracy).

    The polynomial fit to the ratio of the means of Dst′ for intervals of length τ, <Dst′>τ (where Dst′ is the subset of Dst values that are negative), and the annual mean of Dst, <Dst>1year given by the black line in Figure 14b is
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0023(A23)
    For Dst′, the best fit is with the family of Weibull distributions, the variance of which is
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0024(A24)
    where Γ is a gamma function. The best method to find the factor k is by iteration to the value that gives
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0025(A25)
    Note that the mean of the distribution is, unlike for the ap case, not in general unity because of the exclusion of the positive Dst values. Rather, the mean is Rm given by equation A23. This yields
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0026(A26)
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0027(A27)
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0028(A28)
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0029(A29)
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0030(A30)
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0031(A31)

    The normalizing factor fneg (given by equation A22 for a given τ) is needed because the product of the terms a, b, and c gives the p.d.f. of Dst′, but they are only a fraction fneg of the whole Dst sample.

    Equations A20–A31 allow the computation of the pdf f for a negative value of Dst for an averaging timescale τ, <Dst′>τ, for an annual mean of Dst, <Dst>τ=1year.

    Comparison of Figures 13a and 15a demonstrates the fit of the family of distributions to the Dst data.

    Appendix B: Relationship of Daily Means of aa and ap and Correcting ap

    Figure B1 shows scatterplots of 24-hr, 8-point running means of the ap index (by convention referred to as Ap*) in 3-month intervals as a function of the simultaneous corresponding mean of the aa index (Aa*). This plot is restricted to data from between 1932 (the start of the ap index data) and 1956 (inclusive). The end date is chosen because in 1957 there is a calibration error in aa introduced by the move of the Northern Hemisphere aa station from Abinger to Hartland. This has been corrected using the ap index by Lockwood et al. (2014) and Matthes et al. (2017). Hence, it is not appropriate to use data for 1957 and after, either with or without that correction. There is considerable scatter about the trend in Figure B1, much of which is introduced by different annual responses of the two indices associated with the different geographic distribution of stations. Note there are also considerable diurnal differences, but they are averaged out by taking 24-hr means. The relationship between Aa* and Ap* depends on time of year (see Figure B1), and the best fit polynomials to the data for four fraction-of-year intervals, each covering a quarter of a year and centered on the times of the March equinox, June solstice, September equinox, and December solstice, are
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0032(B1)
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0033(B2)
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0034(B3)
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0035(B4)
    Scatterplots of 24-hr means of the ap geomagnetic index, Ap*, as a function of the corresponding means of the aa index, Aa*, for 1932–1956 (inclusive) for 0.25-year intervals around (a) March equinox; (b) June solstice; (c) September equinox, and (d) December solstice. Black squares are means over Aa* bins 40-nT wide. The solid lines are third order polynomial fits, and the dashed lines are plus and minus the best fit 2-sigma error.
    The Ap* correction factor Cap = (Am*/Ap*) (<ap>/<am>) as a function of the time of year, F, and the ap level (shown here on a logarithmic scale) derived from all the coincident ap and am index data (for 1959–2017, inclusive).
    The effect of correcting 24-hr means of the ap index for its dependence on time of year, F: a scatterplot of ApC* (eight-point running means of the corrected apC = ap.Cap) as a function of the corresponding running means of the original ap values, Ap*. The plot is for all ap index data to date (1932–2017, inclusive).

    These polynomial fits, together with plus and minus their 2-sigma errors, are shown in Figure B1 (as solid and dashed lines, respectively). For the estimated Aa* of the Carrington event (Cliver & Svalgaard, 2004), these fits yield Ap* of 275 ± 24, 277 ± 44, 283 ± 30, and 224 ± 33 for the March equinox, June solstice, September equinox, and December solstice data, respectively.

    Our research into the response functions of geomagnetic indices (the collective response of the network of stations used to generate them and of the compilation algorithm used to combine the data from them) using the model of Lockwood, Chambodut, et al. (2018) and Lockwood, Finch, et al. (2018) has shown that the am geomagnetic index has a very flat, almost ideal, time-of-day/time-of-year response. This is achieved because this index employs relatively uniform rings of midlatitude stations in both hemispheres and uses weighted means to account for any spatial nonuniformity of the station network. On the other hand, the compilation of the ap index employs an irregular network of predominantly Northern Hemisphere (mainly European) stations and lookup tables to convert the observations from each into the value that would be seen at the reference Niemegk station before combining them by averaging. The lookup tables are specific to the station location and depend on time of day (UT), time of year (F), and the level of the activity. Cliver and Svalgaard (2004) recognized the value of the am index, compared to indices derived from less-ideal distributions of stations, and used it to correct for the false time-of-day variation in the aa index (and so created what they termed aam). However, they did not correct for the associated spurious time-of-year variation in aa (Lockwood, Finch, et al., 2018) and then used the suggestion of Allen (1982) of 24-hr running means of aam (which they termed Aam*) that largely suppresses the false UT variation anyway. We here apply the same philosophy that Cliver and Svalgaard (2004) adopted but use am to correct for any false time-of-year variation in ap. We do this because the am index data only extends back to 1959, whereas the ap index is available from 1932 onward.

    We have generated a corrected ap index, apC, which allows for effects as a function of the fraction of each year (F) and the ap level using the formula
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0036(B5)
    where the correction factor is given by
    urn:x-wiley:15427390:media:swe20803:swe20803-math-0037(B6)

    The subscript all refers to the averaging of all coincident ap and am data for 1959–2017 (inclusive), and the subscript bin refers to the averaging of data in a given F and ap bin during the same interval. Multiplying by the ratio of the overall means of ap and am means that we correct for the variation with F but do not change the average levels of ap. In practice, the data were divided into 40 equal-size bins of the overall ap distribution, giving 6,282 samples in each ap bin; the values of Cap(F,ap) were then fitted with a sixth-order polynomial in F. The derived correction factor Cap(F,ap) is shown as a function of F (x axis) and log10(Ap) (y axis) in Figure B2. Note that we are not concerned with any limitations in the UT dependence of the response of ap because we use averages over 24-hr intervals, as discussed below. This correction is only approximate because the network of stations used to generate the ap index has changed several times since 1932. However, we do not find any detectable discontinuities in Cap(F,ap) at any of the changes since 1959, and so we assume that changes before this date also have negligible effect. The effect of the correction is not great (see Figure B3) but is largest for the most active days. Many of these storm day values are hardly altered by the correction, but those in Northern Hemisphere winter, in particular, are underestimated in ap, and this is corrected in apC.
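The binned part of this procedure can be sketched in Python as follows. It is a simplified illustration, not the processing used for the paper: it forms the ratio of the bin means of am and ap and multiplies by the ratio of the overall means of ap and am, using equal-population ap bins as described above. The function name, the number of F bins, the small default number of ap bins, and the input layout are hypothetical, and the subsequent sixth-order polynomial fit in F is omitted.

```python
from bisect import bisect_right
from collections import defaultdict
import statistics


def correction_factors(records, n_f_bins=12, n_ap_bins=4):
    """records: list of (F, ap, am) with 0 <= F < 1. Returns {(f_bin, ap_bin): C_ap}."""
    ap_sorted = sorted(r[1] for r in records)
    # Quantile edges so that each ap bin holds about the same number of samples.
    edges = [ap_sorted[(i * len(ap_sorted)) // n_ap_bins] for i in range(1, n_ap_bins)]
    ap_all = statistics.fmean([r[1] for r in records])
    am_all = statistics.fmean([r[2] for r in records])
    sums = defaultdict(lambda: [0.0, 0.0])  # per-bin [sum of ap, sum of am]
    for f, ap, am in records:
        key = (min(int(f * n_f_bins), n_f_bins - 1), bisect_right(edges, ap))
        sums[key][0] += ap
        sums[key][1] += am
    # Within a bin the ratio of sums equals the ratio of means (same sample count).
    return {key: (am_sum / ap_sum) * (ap_all / am_all)
            for key, (ap_sum, am_sum) in sums.items() if ap_sum > 0}


# Made-up coincident data in which am is exactly twice ap, so no correction is needed.
records = [((i % 24) / 24.0, 1.0 + (i * 7) % 50, 2.0 * (1.0 + (i * 7) % 50))
           for i in range(240)]
factors = correction_factors(records)
print(min(factors.values()), max(factors.values()))
```

The multiplication by the ratio of overall means is what keeps the factor at unity when the two indices differ only by a constant scale, so that only the variation with F and activity level is corrected.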

    We follow the procedure of Allen (1982) to make 24-hr boxcar means of apC, ApC*. For the purposes of identifying and ranking storm days we take the largest value of the eight such running means in each calendar day [ApC*]MAX. The 100 largest values of [ApC*]MAX since 1932 are given in rank order in Table S7 of the supporting information. Although there are similarities, this list has a somewhat different ranking order to previous studies (e.g., Cliver & Svalgaard, 2004; Kappenman, 2005; Lefèvre et al., 2016; Nevanlinna, 2006, 2008), largely because of the allowance we make for the variation of the ap index response with time of year. Note that even quite small changes in the estimated magnitude of the storm day can have a very large effect on its position in the ranking order.
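The running-mean and daily-maximum steps can be sketched in Python as follows. This is a minimal illustration with made-up values. The text does not state how each 24-hr window is assigned to a calendar day, so the sketch assumes that a window belongs to the day containing its last 3-hourly sample; that choice, like the function names, is hypothetical.

```python
def running_means(values, n=8):
    """n-point boxcar means; element i averages values[i:i + n]."""
    window_sum = sum(values[:n])
    out = [window_sum / n]
    for i in range(n, len(values)):
        window_sum += values[i] - values[i - n]  # slide the window by one sample
        out.append(window_sum / n)
    return out


def daily_maxima(values, per_day=8):
    """Largest 24-hr running mean per day; values hold per_day samples per day from 00 UT."""
    days = {}
    for i, mean in enumerate(running_means(values, per_day)):
        # Assumption: assign each window to the day that contains its last sample.
        day = (i + per_day - 1) // per_day
        days[day] = max(days.get(day, mean), mean)
    return days


# Made-up 3-hourly values for three days: quiet, disturbed, quiet.
example = [0] * 8 + [80] * 8 + [0] * 8
print(daily_maxima(example))
```

With this assignment the first day has only one complete window, so a long series would normally be processed with at least one preceding day of data.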