The Influence of Solar Wind and Geomagnetic Indices on Lower Band Chorus Emissions in the Inner Magnetosphere

Statistical wave models, describing the distribution of wave amplitudes as a function of location, geomagnetic activity, and other parameters, are needed as the basis to describe the wave‐particle interactions within numerical models of the radiation belts. In this study, we widen the scope of the statistical wave models by investigating which of the solar wind parameters or geomagnetic indices, and which of their time lags, have the greatest influence on the amplitudes of lower band chorus (LBC) waves in the inner magnetosphere. The solar wind parameters or geomagnetic indices with the greatest control over the waves were found using the error reduction ratio (ERR) analysis, which plays a key role in system identification modeling techniques. In this application, the LBC magnitudes at different locations are considered as the output data, while the lagged solar wind parameters are the input data. The ERR analysis automatically determines a set of the most influential parameters that explain the variations in the emissions. Linear and nonlinear applications of the ERR analysis are compared using solar wind inputs, and the comparison shows that the linear ERR analysis can be misleading. The linear results show that the interplanetary magnetic field (IMF) factor has the most influence in each magnetic local time (MLT) sector. However, the nonlinear ERR analysis shows that the IMF factor coupled with the solar wind velocity makes the main contribution to the LBC wave magnitudes. When geomagnetic indices are included as inputs with the solar wind parameters in the nonlinear ERR analysis, the results show that the majority of the variation in emissions may be attributed to the Auroral Electrojet (AE) index. In the dawn sectors between 00 and 12 MLT and 5 < L < 7, the AE index multiplied by the solar wind velocity with zero time lag has the most influence on the amplitudes of LBC.
For 5 < L < 7, the parameters with the highest ERR are the AE index multiplied by the solar wind velocity with a 2‐hr time lag at 12–16 MLT, the linear AE index with a 2‐hr time lag at 16–20 MLT, and the AE index multiplied by the IMF factor with zero lag at 20–00 MLT. For 4 < L < 5, the parameters with the highest ERR are the AE index multiplied by the solar wind dynamic pressure with zero time lag at 00–04 MLT, the AE index multiplied by the solar wind velocity with zero time lag between 04 and 12 MLT, the AE index multiplied by the solar wind velocity with a 2‐hr time lag at 12–16 MLT, the Dst index with a 6‐hr time lag at 12–16 MLT, and the AE index multiplied by the IMF factor with zero lag at 20–00 MLT.


Introduction
Journal of Geophysical Research: Space Physics 10.1029
Highly energetic electrons were observed by Van Allen (1959) during the first in situ space radiation measurements, leading to the discovery of the radiation belts. The population of the energetic electrons has been shown to vary by several orders of magnitude in a short time (Baker et al., 1986; X. Li et al., 2017; Turner et al., 2013). Besides the enhancement events of energetic electrons, there are also significant dropout events (Gao et al., 2015). High fluences of these electrons have been known to cause serious problems for the satellites that transit this region (Welling, 2010; Wrenn, 1995). Electrons with energies from 1 to 100 keV can cause surface charging that interferes with the satellite electronic systems (Mullen et al., 1986; Olsen, 1983), while electrons with energies around 1 MeV and above can cause deep dielectric charging that may permanently damage the materials onboard the satellite (Baker et al., 1987; Gubby & Evans, 2002; Lohmeyer & Cahoy, 2013; Lohmeyer et al., 2015; Wrenn et al., 2002). These problems can range from single event upsets, from which the spacecraft will recover, to the total failure of the satellite (Blake et al., 1992). With prior warning of when these high fluences are expected to occur, it is possible for satellite operators to mitigate some of the damaging effects of these electrons.
To forecast the variations in the electron fluxes, a reliable model of the radiation belt system that can accurately forecast the magnitude of the electron fluxes is required. Data-based models deduced using system science methodologies currently provide very accurate forecasts of the electron fluxes but are limited to regions, such as Geostationary Earth Orbit (GEO), in which there are large volumes of data from which they can be constructed (Boynton et al., 2015, 2016). Physics-based models, which are partially based on first principles, are able to model the variations throughout the whole radiation belts but are less accurate at present. These models, such as the Versatile Electron Radiation Belt (VERB) model (Subbotin et al., 2011) and the Comprehensive Inner Magnetosphere-Ionosphere (CIMI) model (Fok et al., 2014), employ numerical codes that involve finding solutions of the diffusion equations. Within these codes, the tensors of the quasilinear diffusion coefficients need to be calculated. Many approaches have been developed to calculate the diffusion coefficients (Albert, 2008; Shprits et al., 2006; Summers, 2005), all of which require models of various waves.
To accurately evaluate these tensor diffusion coefficients for the VERB code, statistical wave models for lower band chorus (LBC) are needed. Chorus emissions are electromagnetic waves found outside the plasmapause near the geomagnetic equator (Burtis & Helliwell, 1969). They are observed in two frequency bands, above and below half the electron gyrofrequency $\Omega_{ce}$ (Helliwell, 1967; Tsurutani & Smith, 1974), with upper band chorus at $0.5\Omega_{ce} < \omega < \Omega_{ce}$ and LBC at $0.1\Omega_{ce} < \omega < 0.5\Omega_{ce}$, where $\omega$ is the wave frequency. LBC waves have been shown to modify the local electron distribution function through wave-particle interactions (Thorne et al., 2013), resulting in electron acceleration and also the loss of electrons by pitch angle scattering into the loss cone (Bortnik & Thorne, 2007; Mourenas et al., 2014; Shprits et al., 2008). Local acceleration by wave-particle interactions (Summers et al., 2002; Thorne et al., 2005) through efficient energy diffusion (Horne & Thorne, 1998) can further energize the seed electrons to highly relativistic energies (Baker & Kanekal, 2008; Thorne, 2010). The effect of upper band chorus waves on energetic electrons has been shown to be significantly less than that of LBC (Haque et al., 2010; Meredith et al., 2001), and so upper band chorus has not been included in this study. Recent studies of chorus waves have addressed three-wave interaction, multiband chorus (Gao, Ke, et al., 2017), and the decay process.
Currently, the statistical models of the wave distributions are created using wave measurements from various spacecraft and are parameterized by the location of observations and current values for geomagnetic indices (Agapitov et al., 2011; Aryan et al., 2016; Gao, Li, Thorne, Bortnik, Angelopoulos, Lu, Tao, et al., 2014; Gao, Li, Thorne, Bortnik, Angelopoulos, Lu, Tang, et al., 2014; Gao, Mourenas, et al., 2016; X. Li, Temerin, et al., 2011; Meredith et al., 2001; Pokhotelov et al., 2008). These current models assume that the preceding state of the magnetosphere plays no role in the current wave distribution in the magnetosphere. Moreover, it is known that electron fluxes at GEO are influenced more by changes in the solar wind velocity and density than by geomagnetic indices (Blake et al., 1997; Boynton et al., 2013; Paulikas & Blake, 1979; Reeves et al., 2011). In addition, the geomagnetic indices may not account for all the variations in the wave intensities and hence their role in the scattering and acceleration of particles. For example, Reeves et al. (2003) found that only half of the geomagnetic storms measured through the Dst index led to an increase in relativistic electrons. Therefore, parameters that are statistically related to the fluences of electrons should also be included in the development of statistical wave models. The initial problem in developing such a model is twofold: to identify the solar wind parameters and geomagnetic indices that have the greatest influence on the wave distribution at a particular location, and to determine the time delay between cause and effect. Aryan et al. (2014) parameterized the waves according to time-delayed solar wind variables. This was then extended to multiparameter wave models in further studies by Aryan et al. (2016) and Aryan et al. (2017).
The error reduction ratio (ERR) analysis, which is key in the development of Nonlinear Auto-Regressive Moving Average eXogenous input (NARMAX) models, can solve both these problems. The ERR analysis is able to assess the influence of different inputs with different time lags on the measured output. It was first developed by Billings et al. (1988) in the field of system identification to determine the most influential inputs to a NARMAX model. It has since been employed in a wide range of fields, from modeling the tide in the Venice Lagoon (Wei & Billings, 2006) to analyzing the adaptive changes in the photoreceptors of Drosophila flies (Friederich et al., 2009). In the field of space physics, the ERR analysis has been used to develop models for the Dst index (Boaghe et al., 2001; Boynton, Balikhin, Billings, Sharma, et al., 2011) and the electron fluxes at GEO (Boynton et al., 2015; Wei et al., 2011). Due to the ongoing question of which solar wind-magnetosphere coupling function controls the Dst index, Boynton, Balikhin, Billings, Wei, et al. (2011) employed the ERR analysis to deduce a solar wind-magnetosphere coupling function. The advantage of the ERR analysis is that it can automatically combine inputs, cross coupling them to form a nonlinear function. The technique of employing the ERR to automatically determine the most influential inputs to a system was also applied to a wide range of electron flux energies at GEO (Boynton et al., 2013). These studies found that the solar wind density plays a significant role in the dynamics of the high energy electrons (>1 MeV).
The aim of this study is to determine the influential parameters that control the LBC wave amplitude distribution at particular locations using ERR. The ERR analysis is employed to identify these control parameters from a set that includes solar wind variables and geomagnetic indices and also to determine any significant time lags. The first step in this study was to determine which particular locations to use. This is discussed in section 2 along with a description of the instrumentation and data employed for this study. Section 3 gives more detail on the ERR analysis and how it is utilized. The results are presented in section 4 and discussed in section 5. Finally, the study is concluded in section 6.

Data and Instrumentation
The wave data used in this study come from the search coil magnetometer (SCM) instruments onboard the Cluster (Escoubet et al., 1997), Double Star (Liu et al., 2005), and Time History of Events and Macroscale Interactions during Substorms (Angelopoulos, 2008) spacecraft during the periods February 2001 to July 2015, January 2004 to September 2007, and January 2008 to January 2015, respectively. The SpatioTemporal Analysis of Field Fluctuations Spectrum Analyzer (Cornilleau-Wehrlin et al., 1997), onboard both the Cluster and Double Star spacecraft, measured magnetic field oscillations in the frequency range 8 Hz to 4 kHz using 27 logarithmically spaced frequency channels and a sampling rate in the range of 1 to 8 Hz. Time History of Events and Macroscale Interactions during Substorms data come from the SCM (Roux et al., 2008) on satellites A, D, and E. The SCM was designed to investigate magnetic field oscillations in the frequency range 0.1 Hz to 4 kHz in six frequency bands (filter bank mode) with sampling rates between 1/16 and 8 Hz. Wave amplitudes measured at frequencies $0.1\Omega_{ce} < \omega < 0.5\Omega_{ce}$ were used to obtain the wave power of LBC for each spacecraft in time, L-shell, magnetic local time (MLT), and magnetic latitude. The equatorial electron gyrofrequency was calculated using the simple equatorial model ($29{,}000 \times 28 / L^3$) for LBC waves (Meredith et al., 2012). The data are processed in the frequency range $0.1\Omega_{ce} < \omega < 0.5\Omega_{ce}$, with $\Omega_{ce}$ taken as the equatorial value for each L-shell. The SpatioTemporal Analysis of Field Fluctuations and SCM instruments provide specific frequency ranges and frequency bands. No lower limit was imposed on the wave power, as the identification period should include intervals of low wave activity as well as high wave amplitudes to accurately train the algorithm.
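As a rough illustration of the simple equatorial model quoted above, the sketch below evaluates the gyrofrequency and the resulting LBC band at a few L-shells. It assumes that 29,000 is the equatorial surface field in nT and that 28 is the electron gyrofrequency per unit field in Hz/nT, so that the result is in Hz; the function names are hypothetical and are not taken from the study.

```python
def equatorial_fce_hz(l_shell, b0_nt=29_000.0, hz_per_nt=28.0):
    """Equatorial electron gyrofrequency (Hz) from the dipole scaling 29,000 * 28 / L^3.

    Assumption: b0_nt is the surface equatorial field in nT and hz_per_nt
    converts field strength to gyrofrequency.
    """
    return b0_nt * hz_per_nt / l_shell ** 3


def lbc_band_hz(l_shell):
    """Lower and upper edges of the LBC band, 0.1 and 0.5 of the gyrofrequency."""
    fce = equatorial_fce_hz(l_shell)
    return 0.1 * fce, 0.5 * fce


for l_shell in (4.0, 5.0, 7.0):
    low, high = lbc_band_hz(l_shell)
    print(f"L = {l_shell:.0f}: LBC band {low:.1f} to {high:.1f} Hz")
```

The band edges move quickly with L because of the $L^{-3}$ scaling, which is why the band has to be evaluated separately for each L-shell.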
The solar wind data used for this study were obtained from the OMNI website (http://omniweb.gsfc.nasa.gov). The 1-min solar wind velocity, density, and IMF data were then averaged over 1 hr. The AE index and Dst index were obtained from the World Data Center for Geomagnetism, Kyoto (http://wdc.kugi.kyoto-u.ac.jp/index.html).
Here the hourly Dst index was employed as input to the algorithm without modification, while the 1-min AE index data were averaged over 1 hr.

Data Binning
The next step was to determine the spatial resolutions for each of the bins or sectors. This study only considered measurements in the vicinity of the equator by restricting wave measurements to the magnetic latitude range $|\lambda| < 15^\circ$, where $\lambda$ is the magnetic latitude. The data were separated into two ranges, L = 4-5 and L = 5-7, in the radial direction and into six azimuthal ranges: MLT = 00-04, 04-08, 08-12, 12-16, 16-20, and 20-24. The spatial sizes of these bins were chosen to maximize the amount of data for the ERR analysis.
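The binning described above can be sketched as a small lookup function. The treatment of values that fall exactly on a bin edge (for example L = 5) is an assumption made here for illustration, as are the function name and the label format.

```python
def spatial_bin(l_shell, mlt_hours, mlat_deg):
    """Return (L range, MLT sector) labels for one measurement, or None if it is excluded.

    Assumptions: half-open intervals [4, 5) and [5, 7) in L, and 4-hr MLT sectors
    starting at 00 MLT. Only |magnetic latitude| < 15 degrees is kept.
    """
    if abs(mlat_deg) >= 15.0:
        return None
    if 4.0 <= l_shell < 5.0:
        l_label = "L4-5"
    elif 5.0 <= l_shell < 7.0:
        l_label = "L5-7"
    else:
        return None
    start = int((mlt_hours % 24.0) // 4.0) * 4
    return l_label, f"{start:02d}-{start + 4:02d}"


print(spatial_bin(5.5, 21.5, 3.0))    # outer bin, pre-midnight sector
print(spatial_bin(4.2, 0.5, -10.0))   # inner bin, post-midnight sector
print(spatial_bin(4.2, 0.5, 20.0))    # too far from the equator
```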
Once all the spatial resolutions for the bins were determined, a 1-hr resolution time series data set was constructed at each selected location from the data set of LBC wave intensity for each spacecraft in time, L-shell, MLT, and magnetic latitude. Within each of the spatial bins, the data point at time t was the maximum wave magnitude between the start of the hour and just before the end of the hour. If no satellite measured the wave magnitude within the spatial bin for time t, then the value was set to not-a-number (NaN), and the ERR analysis would exclude this data point within the algorithm. Since the satellite coverage of the desired spatial bins was sparse, the majority of each data set consisted of data gaps. Table 1 shows the number of usable LBC data points in each of the bins, that is, the number of hours in which there are measurements out of the total period from February 2001 to July 2015.
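A minimal sketch of this hourly reduction for a single spatial bin is given below. The sample times and amplitudes are made-up values, and the function name is hypothetical; the only behavior taken from the text is the hourly maximum and the use of NaN for hours without measurements.

```python
import math


def hourly_max_series(samples, n_hours):
    """Reduce (time in hours, amplitude) samples from one spatial bin to an hourly series.

    Each hour holds the maximum amplitude observed in [hour, hour + 1);
    hours without any measurement stay NaN so they can be excluded later.
    """
    series = [math.nan] * n_hours
    for time_hr, amplitude in samples:
        hour = math.floor(time_hr)
        if 0 <= hour < n_hours:
            if math.isnan(series[hour]) or amplitude > series[hour]:
                series[hour] = amplitude
    return series


# Illustrative samples only: two in hour 0, none in hours 1 and 3, two in hour 2.
samples = [(0.10, 3.0), (0.75, 8.0), (2.20, 1.5), (2.90, 0.4)]
series = hourly_max_series(samples, 4)
usable_hours = sum(not math.isnan(value) for value in series)
print(series, usable_hours)
```

Counting the non-NaN hours, as in `usable_hours`, corresponds to the number of usable data points reported per bin.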

NARMAX ERR Algorithm
A single-output, multi-input NARMAX model can be represented as

$$y(t) = F\left[y(t-1), \ldots, y(t-n_y),\; u_1(t-1), \ldots, u_1(t-n_{u_1}),\; \ldots,\; u_m(t-1), \ldots, u_m(t-n_{u_m}),\; e(t-1), \ldots, e(t-n_e)\right] + e(t) \tag{1}$$

where $y$ at time $t$ is the output parameter that is to be modeled as some nonlinear function, $F$, of past outputs, past inputs $u_1, \ldots, u_m$ (the $m$ different inputs), and past error terms $e$. Here $n_y$, $n_{u_1}, \ldots, n_{u_m}$, and $n_e$ are the maximum lags for the output, the $m$ inputs, and the error terms. The lags of the past outputs, inputs, and error as well as the nonlinear function $F$ are all set by the user of the algorithm. $F$ can be chosen to be any nonlinear function, such as wavelets or radial basis functions, but for this study $F$ was set as a polynomial.
The number of monomials, $M$, within the polynomial can be calculated from

$$M = \sum_{i=0}^{L} n_i$$

where $L$ is the degree of nonlinearity, $n_0 = 1$, and

$$n_i = \frac{n_{i-1}\left(n_y + n_{u_1} + \cdots + n_{u_m} + n_e + i - 1\right)}{i}, \quad i = 1, \ldots, L.$$

If equation (1) is set to be a polynomial with a third degree of nonlinearity with six inputs and the number of lags for the output, six inputs, and error terms is set to 10, then there will be 91,881 monomials within the polynomial. These monomials will include linear-, quadratic-, and cubic-coupled terms plus a constant. The vast majority of these monomials will have a negligible influence on the output, and thus, the coefficient attached to these monomials will be 0. The methodology employed for this study is the ERR analysis, which plays a pivotal role in identifying a NARMAX model (Leontaritis & Billings, 1985a, 1985b) and is based on the Forward Regression Orthogonal Least Squares (FROLS) algorithm (Billings et al., 1988). If the system has low dimensionality, the majority of the variance of $y$ can be explained by a few monomials, and the FROLS algorithm is able to deduce and rank these significant monomials from the input and output data. This makes the FROLS algorithm highly useful for determining the parameters that influence the system, since in this study we do not know in advance which solar wind and geomagnetic conditions result in the growth of waves within the inner magnetosphere.
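The monomial count quoted above can be checked with a few lines of code. The sketch assumes the usual recursion for polynomial NARMAX models, in which the number of degree-i monomials is $n_i = n_{i-1}(n + i - 1)/i$ with $n_0 = 1$ and $n$ the total number of lagged terms; the function name is hypothetical.

```python
def count_monomials(n_lagged_terms, degree):
    """Number of monomials up to `degree` built from `n_lagged_terms` lagged variables."""
    n_i = 1          # n_0 = 1 is the constant term
    total = n_i
    for i in range(1, degree + 1):
        # The product is always divisible by i, so integer division is exact.
        n_i = n_i * (n_lagged_terms + i - 1) // i
        total += n_i
    return total


# Ten lags each for the output, six inputs, and the error terms: 10 + 6 * 10 + 10 = 80.
n_lagged_terms = 10 + 6 * 10 + 10
print(count_monomials(n_lagged_terms, 3))
```

With 80 lagged terms and a third degree of nonlinearity the count is 1 + 80 + 3,240 + 88,560 = 91,881, in agreement with the figure given in the text.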
The FROLS algorithm ranks each candidate monomial by its ERR. The ERR of a monomial represents the proportion (or percentage) of the output variance that is accounted for by that particular monomial. The process that is used to determine the ERR involves an iterative forward regression methodology and proceeds as follows. During the first iteration, the ERR is calculated for each of the candidate monomials, $p_i(t)$, with respect to the output data set, $y(t)$. The candidate monomials, $p_i(t)$, consist of the possible linear and nonlinear coupled inputs and past outputs from the polynomial expansion of $F$. In the first step, the ERR of the $i$th monomial is calculated as

$$\mathrm{ERR}_i = \frac{\left(\sum_{t=1}^{N} p_i(t)\, y(t)\right)^2}{\sum_{t=1}^{N} p_i(t)^2 \, \sum_{t=1}^{N} y(t)^2}$$

where $N$ is the number of data points. The monomial with the highest value of ERR is selected as the first model term, and the remaining monomials are then orthogonalized from $p_i(t)$ to $w_i(t)$ with respect to the selected monomial $q_1(t)$ by

$$w_i(t) = p_i(t) - \frac{\sum_{\tau=1}^{N} p_i(\tau)\, q_1(\tau)}{\sum_{\tau=1}^{N} q_1(\tau)^2}\, q_1(t).$$

The orthogonalization allows the individual contribution of each monomial to be determined. A second iteration is then performed on the remaining orthogonalized monomials, calculating a new set of ERR values (with $w_i(t)$ in place of $p_i(t)$) and extracting the term with the highest ERR. The third iteration orthogonalizes the remaining terms with respect to both the first and second monomials identified. In the $k$th step the remaining monomials, $p_i(t)$, are orthogonalized with respect to the selected monomials $q_1(t), q_2(t), \ldots, q_{k-1}(t)$ by

$$w_i(t) = p_i(t) - \sum_{r=1}^{k-1} a_{r,i}\, q_r(t)$$

where

$$a_{r,i} = \frac{\sum_{\tau=1}^{N} p_i(\tau)\, q_r(\tau)}{\sum_{\tau=1}^{N} q_r(\tau)^2}.$$

These processes of orthogonalization with respect to the previously determined subspaces, ERR calculation, and term selection continue until the desired number of monomial terms has been selected. With each additional monomial selected, an increasing amount of the variance of the dependent variable is accounted for, that is, the sum of the ERR increases, and thus, the ratio of error to signal is reduced. The full details of the FROLS algorithm are beyond the scope of this paper, but detailed explanations of the algorithm can be found in Billings et al. (1989) or Boynton, Balikhin, Billings, Wei, et al. (2011).
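The iteration described above can be sketched in a few lines. This is a simplified illustration, not the implementation used in the study: it assumes that the ERR of a term is its squared inner product with the output divided by the product of the two sums of squares, it omits the handling of data gaps, and the candidate names and values are invented.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def frols_rank(candidates, y, n_terms):
    """Rank candidate terms by ERR using forward selection with orthogonalization.

    candidates: dict mapping a term name to its time series.
    Returns a list of (name, ERR) pairs in order of selection.
    """
    yy = dot(y, y)
    remaining = {name: list(values) for name, values in candidates.items()}
    selected = []
    for _ in range(n_terms):
        best_name, best_err = None, -1.0
        for name, w in remaining.items():
            ww = dot(w, w)
            if ww <= 1e-12:        # term already explained by earlier selections
                continue
            err = dot(w, y) ** 2 / (ww * yy)
            if err > best_err:
                best_name, best_err = name, err
        if best_name is None:
            break
        q = remaining.pop(best_name)
        selected.append((best_name, best_err))
        qq = dot(q, q)
        # Remove the newly selected direction from every remaining candidate.
        for name in list(remaining):
            w = remaining[name]
            a = dot(w, q) / qq
            remaining[name] = [wi - a * qi for wi, qi in zip(w, q)]
    return selected


# Toy data: the output is built from x1 and x2 only; x1*x2 is a distractor.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
y = [2.0 * a + 0.5 * b for a, b in zip(x1, x2)]
candidates = {"x1": x1, "x2": x2, "x1*x2": [a * b for a, b in zip(x1, x2)]}
ranking = frols_rank(candidates, y, n_terms=2)
print(ranking)
```

Because the toy output lies entirely in the span of `x1` and `x2`, the two selected ERR values sum to one, which mirrors the statement that the summed ERR measures the explained proportion of the output.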

Application of ERR Analysis
For this study, LBC wave data in a location described by MLT and L-shell are taken as the output data. The ERR analysis was then run for each location bin with the LBC wave amplitudes as the output. The inputs used were initially the solar wind velocity, density, dynamic pressure, and the IMF factor of the coupling function proposed by Balikhin et al. (2010) and Boynton, Balikhin, Billings, Wei, et al. (2011), $B_F = B_T \sin^6(\theta/2)$ (where $B_T = \sqrt{B_y^2 + B_z^2}$ is the tangential IMF, and $\theta = \tan^{-1}(B_y/B_z)$ is the clock angle of the IMF). For each of the LBC output data sets (characterized by MLT and L-shell), there are many data gaps because it is impossible for the satellites to monitor each location all the time. As a result, there are very few cases for which there are sufficient data to assess the contribution of the previous output value to the system, that is, whether the system has a memory. When the previous output value is included in the search, there are very few data points with which to calculate the ERR, and the results would not be reliable. Since including past outputs in the initial polynomial would decrease the number of usable points for the FROLS algorithm to train on, all auto-regressive terms in equation (1) were removed from the search. The error terms were also excluded from the search for the same reason: there would be too few past error terms, obtained using the past output, to calculate the ERR. This leaves only monomials consisting of the linear and nonlinear combinations of the exogenous inputs to be considered as candidates in the search. For each output data set, the lags were set to be 0, 2, 4, … , 20 hr, while the degree of the polynomial was initially set to be linear to allow for a simpler analysis of the results, and then the complexity was increased to investigate a quadratic degree of nonlinearity.
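The construction of the exogenous candidate terms can be illustrated as follows. This is a sketch under stated assumptions: the clock angle is evaluated with the quadrant-aware `atan2` so that a purely southward IMF gives $\theta = \pi$, the quadratic candidates are taken to be all pairwise products of lagged inputs (the text does not state whether cross-lag products were allowed), and the input series are arbitrary placeholder values rather than solar wind data.

```python
import math
from itertools import combinations_with_replacement

LAGS_HR = range(0, 21, 2)  # lags of 0, 2, ..., 20 hr


def imf_factor(by, bz):
    """IMF factor B_T * sin^6(theta / 2) with B_T the tangential IMF magnitude."""
    b_t = math.hypot(by, bz)
    theta = math.atan2(by, bz)  # assumption: quadrant-aware clock angle
    return b_t * math.sin(theta / 2.0) ** 6


def lagged(series, lag):
    """Shift an hourly series by `lag` hours, padding the start with NaN."""
    return ([math.nan] * lag + list(series))[:len(series)]


def build_candidates(inputs, degree=2):
    """Linear lagged terms and, for degree 2, all pairwise products of them."""
    linear = {f"{name}(t-{lag})": lagged(values, lag)
              for name, values in inputs.items() for lag in LAGS_HR}
    candidates = dict(linear)
    if degree >= 2:
        for (name_a, a), (name_b, b) in combinations_with_replacement(linear.items(), 2):
            candidates[f"{name_a}*{name_b}"] = [x * y for x, y in zip(a, b)]
    return candidates


# Placeholder hourly inputs (hypothetical values, 30 hr long).
hours = range(30)
by = [3.0 * math.sin(0.3 * t) for t in hours]
bz = [-4.0 * math.cos(0.2 * t) for t in hours]
inputs = {
    "v": [400.0 + 5.0 * t for t in hours],
    "n": [5.0 + 0.1 * t for t in hours],
    "p": [2.0 + 0.05 * t for t in hours],
    "BF": [imf_factor(y_val, z_val) for y_val, z_val in zip(by, bz)],
}
print(len(build_candidates(inputs, degree=1)), len(build_candidates(inputs, degree=2)))
```

With four inputs and eleven lags there are 44 linear candidates, and allowing all pairwise products adds 990 quadratic ones. Hours in which any factor of a candidate is NaN would need to be dropped before the ERR is computed.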

LBC Wave Distribution With Linear Solar Wind Parameters
The results for the ERR analysis using a linear polynomial for the function F with solar wind inputs are displayed in Figure 1. The figure shows a polar representation of the inner magnetosphere with L-shell as radial distance and MLT as azimuth. Each spatial bin used in the analysis is delineated by a white border. For each individual sector, there are two colors that represent the top two control parameters of the LBC emission according to their ERR. The radial width of each colored segment is proportional to the parameter's relative contribution to the emission, that is, if the ERR of the top parameter was 20% and the second parameter was 10%, then the color of the top parameter would be in the outer two thirds of the radial distance for that sector, while the color for the second parameter would be in the remaining third. The sum of the ERR of the top two parameters, $\sum \mathrm{ERR}_{1,2}$, is also displayed in each bin as a percentage. Each of the solar wind input parameters is represented by a different color. The solar wind velocity is indicated by red, the density by yellow, the pressure by green, and the IMF factor from the coupling function proposed by Balikhin et al. (2010) and Boynton, Balikhin, Billings, Wei, et al. (2011) is cyan. The effective lag of the control parameter is also depicted, where darker colors and more stripes signify a larger time lag. Figure 2 shows the legend for the parameters and lags. Figure 1 shows that a lag of the IMF factor has the highest ERR in all the bins. The IMF factor has zero lag for bins going from 20 MLT anticlockwise to 08 MLT as well as the 16-20 MLT L = 4-5 bin. From 08 MLT, continuing anticlockwise, the time lag of the IMF increases to 20 MLT for the outer bins and 16 MLT for the inner bins. Cyan with a stripe represents a 2-hr time lag of IMF from 08 to 12 MLT, two stripes represent a 4-hr lag for 12-16 MLT, and dark cyan with one stripe represents a 10-hr lag for the outer 16-20 MLT sector. This lag on the dayside could be due to the plasmasphere plume reducing in size after a geomagnetic storm; once the plasmasphere has retreated, the LBC waves are observed.
(Figure caption: Same as Figure 1 but with the error reduction ratio algorithm including second degree nonlinear terms.)
According to the ERR, the solar wind density has a significant contribution to the LBC waves, apart from the 16-20 MLT outer and inner bins and the 00-04, 08-12, and 12-16 MLT inner bins. Solar wind dynamic pressure is the second parameter for three of the inner bins (00-04 MLT and 08-16 MLT), while the solar wind velocity and a large lag of the IMF factor are the second parameters in the two 16-20 MLT bins. Aryan et al. (2014) also found a relationship between density and LBC; however, the authors found velocity and B s had a more significant influence employing the Kullback-Leibler divergence as a metric (Kullback & Leibler, 1951).
The ERR of a parameter quantifies the proportion of the variance of the dependent variable, here the wave magnitude, that is explained by that parameter. Therefore, significant differences in the sum of the ERR of the two parameters, $\sum \mathrm{ERR}_{1,2}$, between each sector should be noted. This is displayed as the numbers in each bin. For instance, $\sum \mathrm{ERR}_{1,2}$ in the outer late morning sector is 15.7%, which is much greater than in the outer dawn sector, where $\sum \mathrm{ERR}_{1,2} = 1.8\%$. The inner sectors generally have a higher $\sum \mathrm{ERR}_{1,2}$ than the outer sectors apart from in the late morning sector, where the outer sector has $\sum \mathrm{ERR}_{1,2} = 15.7\%$ and the inner sector has $\sum \mathrm{ERR}_{1,2} = 13.8\%$. Figure 1 effectively shows the variance of LBC waves explained by the linear solar wind parameters; however, the relationship between the solar wind and LBC waves in the inner magnetosphere is very complex and most likely nonlinear. As Boynton, Balikhin, Billings, Wei, et al. (2011) mentioned, applying linear techniques to nonlinear systems can lead to very misleading conclusions. For example, for a simple quadratic system where $y = x^2$, if $x$ has zero mean and a symmetric distribution, the correlation coefficient between $x$ and $y$ will be 0. This may lead to the conclusion that $x$ has no relationship with $y$, even though it is the only input. Therefore, to fully explore the solar wind-LBC relationship, the ERR analysis needs to include nonlinear solar wind parameters. Figure 3 shows the top two quadratic nonlinear solar wind control parameters for LBC waves in each of the 12 sectors analyzed. The colors used to represent the solar wind parameters and their time lags are the same as in Figure 1. To represent the coupled nonlinear solar wind parameters selected by the ERR algorithm, the sectors are divided up azimuthally according to the parameter.
For example, in the outer 08-12 MLT sector, the term with the highest ERR is quadratic because it is divided azimuthally into two, red and cyan, both with one stripe, which indicates that the term is the solar wind velocity multiplied by the IMF factor with a time lag of 2 hr, $vB_F(t-2)$. The segment for the second highest ERR is not divided azimuthally, which means that the term is linear; here it is the density, $n(t)$. Again, the radial width of each of the segments is proportional to the parameter's relative ERR contribution to the emission, with the sum of the ERR of the top two parameters displayed in each sector as a percentage.
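The quadratic example used above to motivate the nonlinear analysis, a system with y equal to the square of a zero-mean input x, can be reproduced numerically. The sketch uses a small symmetric set of x values chosen for illustration and the same ERR ratio as in the first FROLS step; the helper names are hypothetical.

```python
import math


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    dev_a = [v - mean_a for v in a]
    dev_b = [v - mean_b for v in b]
    return dot(dev_a, dev_b) / math.sqrt(dot(dev_a, dev_a) * dot(dev_b, dev_b))


def err(term, y):
    """ERR of a single candidate term with respect to the output y."""
    return dot(term, y) ** 2 / (dot(term, term) * dot(y, y))


x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]   # zero mean and symmetric
y = [v * v for v in x]                       # purely quadratic system

r_linear = pearson(x, y)
err_linear = err(x, y)
err_quadratic = err([v * v for v in x], y)
print(r_linear, err_linear, err_quadratic)
```

The linear measures vanish even though x fully determines y, whereas the quadratic candidate accounts for all of the output.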

LBC Wave Distribution With Quadratic Solar Wind Parameters
The results in Figure 3 differ from those obtained using only linear solar wind parameters. The IMF factor is still a very important parameter, being influential in each of the 12 sectors. However, unlike in the linear case, the solar wind velocity also has a large role when coupled with the IMF factor. A term with solar wind velocity coupled with the IMF factor is present in each of the sectors, as the first term in all but the two 16-20 MLT bins. As with Figure 1, the velocity coupled IMF parameter has a lag in the dayside bins between 08 and 16 MLT. These results agree with the Kullback-Leibler analysis performed by Aryan et al. (2014), who found that velocity and B s both had a significant influence on LBC emissions. The solar wind density is also shown to have an influence in four of the bins and solar wind pressure in one bin.
To ensure the results were consistent, the bins were rotated by 2 hr so that the six azimuthal bins were MLT = 22-02, 02-06, 06-10, 10-14, 14-18, and 18-22. The results are similar to Figure 3, with the solar wind velocity multiplied by the IMF factor selected in all the sectors apart from the inner 14-18 MLT bin, where the term with the highest ERR is the solar wind dynamic pressure multiplied by the IMF factor. The dayside lag that appeared in Figures 1 and 3 also remains with the rotated bins in Figure 4.

LBC Wave Distribution With Quadratic Solar Wind Parameters and Geomagnetic Indices
In most previous studies the statistical models of the wave distributions are parameterized by the location of observations and current values for geomagnetic indices such as the AE index (Agapitov et al., 2011; X. Li, Temerin, et al., 2011; Meredith et al., 2001; Pokhotelov et al., 2008). The results of the solar wind influence on LBC waves using the ERR analysis have shown the main solar wind contributor to be solar wind velocity multiplied by the IMF factor. This is very similar to the solar wind-magnetosphere coupling functions that are often employed to model and forecast geomagnetic indices (Amariutei & Ganushkina, 2012; Borovsky, 2014; Borovsky & Denton, 2014; Boynton, Balikhin, Billings, Wei, et al., 2011; Klimas et al., 1996; Newell et al., 2007). Geomagnetic indices were used as additional inputs to investigate whether they lead to improved models for the amplitude of LBC when compared with those resulting from the use of solar wind parameters alone. The inputs used were the solar wind velocity, density, dynamic pressure, and the IMF factor $B_F = B_T \sin^6(\theta/2)$, plus the AE and Dst indices. Figure 5 shows the results of the ERR analysis, where, as in the previous figures, the solar wind velocity is indicated by red, the density by yellow, the pressure by green, and the IMF factor by cyan, while the Dst index is represented by blue and the AE index by magenta. Figure 6 shows the legend for the parameters and lags.
The main change between Figure 5 and the previous figures is that the AE index makes a major contribution to the LBC emissions in all but the inner 16-20 MLT sector, where the Dst index is dominant and the IMF factor is second. Between 00 and 16 MLT moving anticlockwise, the first term selected by the ERR algorithm contains solar wind velocity (through pressure in the inner 00-04 MLT bin) multiplied by a geomagnetic index (AE index from 00 to 12 MLT and Dst index for the inner 12-16 MLT bin). The two 20-00 MLT bins both have the AE index multiplied by the IMF factor as the term with the highest ERR.

Discussion
The aim of this study was to determine which solar wind and geomagnetic parameters have the greatest influence on the LBC emissions. This knowledge is needed to develop better statistical wave models, which may subsequently be used to evaluate the tensors of the quasilinear diffusion coefficients within electron flux models such as VERB (Subbotin et al., 2011). Currently, statistical wave models only use geomagnetic indices and do not take into account time delays. This study assesses both solar wind and geomagnetic parameters with up to 20 hr of lag, which should better account for the dynamical processes within the outer radiation belt. Therefore, the results of this study will potentially lead to better statistical wave models and an improved understanding and parameterisation of wave-particle interactions, which would result in more realistic models and improved forecasts of electron fluxes in the radiation belts from first principles based tools such as VERB and CIMI.
The results for LBC emissions are comparable with previous studies that compared wave distributions to geomagnetic indices (Aryan et al., 2014; W. Li, Bortnik, et al., 2011; Meredith et al., 2003, 2012). These studies found a strong relationship with geomagnetic indices, while the results from Aryan et al. (2014) showed some dependency on solar wind parameters. Aryan et al. (2014) found that intense LBC occurs at times when the AE index, solar wind velocity, and dynamic pressure are high, the solar wind density is low, and the z component of the IMF is southward. However, identifying the correct set of parameters that control the LBC wave magnitudes is more complex because it is well known that geomagnetic indices, such as the AE index, have a strong relationship with solar wind parameters. The geomagnetic indices are often modeled using inputs composed of variants of solar wind velocity multiplied by a southward IMF factor (Amariutei & Ganushkina, 2012; Klimas et al., 1996), which are often referred to as solar wind-magnetosphere coupling functions (Newell et al., 2007). Therefore, high wave intensities during periods of high solar wind velocity may be due to the high solar wind velocity increasing the geomagnetic activity. The ERR is able to separate out the individual dependencies for each of the parameters and assess their contribution. For example, if the AE index is the actual cause of the emission variation and the IMF controls a large proportion of the AE index variation, then the IMF will only contribute to the wave intensities as part of the AE index contribution. However, independently applying a correlation test to each of the solar wind and geomagnetic index variables will indicate a high correlation between the IMF and wave intensity, which is misleading since all the IMF contribution in this example is part of the AE index. The ERR analysis should identify the AE index as the parameter with the strongest relationship with the wave intensity.
When searching for the second parameter, the methodology will remove all the IMF contribution associated with the AE index through the orthogonalization discussed in section 3. In this example, the IMF would not be selected as a parameter even though the correlation test may have indicated it had the second highest correlation (after AE index) with the wave intensities.
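The AE/IMF example above can be illustrated with a minimal sketch on synthetic data. This is not the code used in the study: the function name `err_forward_selection`, the noise levels, and the choice to remove the mean from every series are assumptions made for illustration only. The synthetic "wave" output depends on the IMF solely through the AE index, so a correlation test still flags the IMF, whereas the ERR of the IMF after orthogonalization against AE is close to zero.

```python
import numpy as np

def err_forward_selection(candidates, y, n_terms):
    """Greedy forward selection ranked by error reduction ratio (ERR).

    candidates: dict mapping a term name to a 1-D array (hypothetical inputs).
    Assumes no candidate is an exact linear copy of an already selected term.
    """
    y = y - y.mean()
    # Work on mean-removed copies so ERR is a fraction of output variance.
    remaining = {name: col - col.mean() for name, col in candidates.items()}
    sigma = y @ y
    selected = []
    for _ in range(n_terms):
        errs = {name: (w @ y) ** 2 / ((w @ w) * sigma)
                for name, w in remaining.items()}
        best = max(errs, key=errs.get)
        selected.append((best, errs[best]))
        q = remaining.pop(best)
        # Gram-Schmidt step: strip the selected term's direction from the rest.
        remaining = {name: w - (w @ q) / (q @ q) * q
                     for name, w in remaining.items()}
    return selected

rng = np.random.default_rng(0)
n = 5000
imf = rng.standard_normal(n)                  # synthetic IMF factor
ae = imf + 0.5 * rng.standard_normal(n)       # AE largely controlled by IMF
wave = ae + 0.5 * rng.standard_normal(n)      # output driven by AE only

corr_imf = np.corrcoef(imf, wave)[0, 1]
selected = err_forward_selection({"AE": ae, "IMF": imf}, wave, n_terms=2)

print(f"correlation(IMF, wave) = {corr_imf:.2f}")
for name, value in selected:
    print(f"{name}: ERR = {value:.4f}")
```

In this toy case the IMF correlates strongly with the output, yet it adds almost nothing once the AE contribution has been removed, which is the behavior described in the text.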
This study initially investigated the linear solar wind parameter contribution to LBC waves in the inner magnetosphere. This resulted in the IMF factor having the highest ERR in each of the sectors, with solar wind velocity only having a contribution in four of the sectors (acting through the pressure in three of the sectors). However, when allowing for nonlinear quadratic solar wind parameters in the ERR algorithm, the solar wind velocity has a significant role in every bin, mostly when coupled with the IMF factor. This highlights how linear techniques can be misleading when applied to nonlinear systems.
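A small synthetic sketch shows how a purely linear candidate set can mislead when the underlying dependence is a product. The series, their ranges, and the helper `err` are hypothetical and are not taken from the study's data; every series is mean-removed before the ERR is computed, which is an assumption of this sketch.

```python
import numpy as np

def err(term, y):
    """ERR of a single candidate term (first selection step, mean-removed)."""
    p = term - term.mean()
    z = y - y.mean()
    return (p @ z) ** 2 / ((p @ p) * (z @ z))

rng = np.random.default_rng(1)
n = 5000
v = rng.uniform(3.0, 7.0, n)               # hypothetical velocity, units of 100 km/s
b = np.abs(rng.standard_normal(n))         # hypothetical non-negative IMF factor
wave = v * b + 0.3 * rng.standard_normal(n)  # output driven only by the product

errs = {"V": err(v, wave), "B": err(b, wave), "V*B": err(v * b, wave)}
for name, value in errs.items():
    print(f"{name:4s} ERR = {value:.3f}")
```

With only linear terms available, the IMF-like input B ranks far above V, suggesting that velocity barely matters. Once the quadratic term V*B is admitted as a candidate, it explains nearly all of the variance, showing that velocity was acting through the coupling all along.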
When geomagnetic indices are included in the algorithm, the IMF factor is replaced by the AE index (and the Dst index in the inner 12-16 MLT bin) in the majority of the sectors, from 00 to 16 MLT. The IMF factor is no longer selected because its contribution to the LBC waves is better represented through the AE index; it is well known that the IMF plays a large role in geomagnetic storms and substorms through reconnection between the solar wind and magnetosphere (Dungey, 1961). The solar wind velocity contribution, however, cannot be wholly attributed to the AE index, since the velocity appears coupled with the AE index, implying that a faster or slower solar wind may respectively enhance or inhibit LBC wave intensities. The importance of solar wind speed could be connected to Corotating Interaction Regions, which are known to lead to an enhancement in the high energy electron fluxes (Miyoshi & Kataoka, 2008). Therefore, solar wind velocity should also be included in the statistical wave models. Similarly, Aryan et al. (2016) found that the combination of high AE and high solar wind velocity led to the highest LBC intensities and concluded that the AE index alone can underestimate the LBC intensities. A 20-hr time lag of the solar wind velocity is the term with the second highest ERR from 00 to 16 MLT at 5 < L < 7. This could be connected to the high energy electron fluxes having a similar time lag with solar wind velocity (Balikhin et al., 2012; X. Li et al., 2005).
The results of the ERR analysis show that the AE index coupled with velocity has a strong relationship with LBC waves in the same locations as the high intensity LBC waves observed by Meredith et al. (2001). Meredith et al. (2001) found that the most intense LBC emissions, with amplitudes typically >0.5 mV/m, occur between 23 and 13 MLT with L > 3. This spatial location also corresponds to where the largest sum of the ERR is found, which is logical, since larger variations in the signal give a larger signal-to-noise ratio and hence a smaller unexplained fraction of the variance (1 − ∑ ERR). The lower ∑ ERR 1,2 on the dusk side, which is observed in each figure, could mean that the results for these sectors are affected by noise. Statistical wave models developed for these sectors would therefore have large errors. The time lags indicate that these high intensity LBC emissions are generated all across the dawn side of the inner magnetosphere, between 00 and 12 MLT, immediately after substorm activity as measured through the AE index multiplied by solar wind velocity. In the outer 12-16 MLT sector, the LBC emissions occur 2 hr after the activity measured through the AE index multiplied by solar wind velocity.
This study only investigated the two terms with the highest ERR, since the statistical wave models become increasingly complex with each additional parameter. A future option would be to develop a Volterra series wave model using the FROLS algorithm, which would consist of linear and nonlinear combinations of the inputs.
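As an illustration of what the candidate set for such a model might look like, the sketch below assembles linear and quadratic combinations of lagged inputs. The function `build_candidates`, the term naming scheme, and the placeholder hourly series are all hypothetical; truncating the expansion at second order is an assumption that mirrors the quadratic terms considered in this study.

```python
import numpy as np
from itertools import combinations_with_replacement

def build_candidates(inputs, lags):
    """Build linear and quadratic candidate terms from lagged input series.

    inputs: dict of equally sampled 1-D series (e.g., hourly values).
    lags:   iterable of non-negative integer lags, in samples.
    All returned columns are aligned to times t = max(lags), ..., n - 1.
    """
    max_lag = max(lags)
    n = len(next(iter(inputs.values())))
    linear = {}
    for name, x in inputs.items():
        x = np.asarray(x, dtype=float)
        for k in lags:
            # Row i corresponds to time t = max_lag + i, so x(t - k) is this slice.
            linear[f"{name}(t-{k})"] = x[max_lag - k : n - k]
    candidates = dict(linear)
    # Second-order terms: every pairwise product, including squares.
    for a, b in combinations_with_replacement(linear, 2):
        candidates[f"{a}*{b}"] = linear[a] * linear[b]
    return candidates

hours = np.arange(48.0)
inputs = {"AE": hours, "V": 400.0 + hours}   # placeholder series, not real data
candidates = build_candidates(inputs, lags=[0, 2])
print(len(candidates), "candidate terms")
```

Each column of the resulting dictionary could then be ranked by its ERR against the wave amplitudes for a given sector. The number of candidates grows quickly with the number of inputs and lags, which is one reason the model complexity has to be limited.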
It should be noted that the spatial size of each sector was a compromise, chosen so that there was enough data to perform the ERR analysis. With more wave magnitude data available, it would be possible to increase the spatial resolution of this type of analysis and perhaps improve the results.

Conclusions
This study has analyzed the solar wind and geomagnetic influences on the LBC waves in the inner magnetosphere. In most previous studies, the statistical wave models used in numerical diffusion codes have only considered geomagnetic influences, such as the AE index. The results presented in this study show that the AE index controls the largest proportion of the emissions variance at all MLTs for 4 < L < 7, apart from the inner 16-20 MLT bin. However, the solar wind parameters also make a significant contribution to the emissions variance according to the ERR analysis. In all but the 16-20 MLT bin, the term with the highest ERR is a solar wind parameter multiplied by a geomagnetic index. The solar wind velocity has a major influence on the dawn side between 00 and 16 MLT, where it is coupled with the AE index (and the Dst index in the inner 12-16 MLT bin). This region between 00 and 16 MLT is where the highest-amplitude LBC waves were observed by Meredith et al. (2001) and also corresponds to where the largest sum of the ERR is found. The lower ∑ ERR 1,2 on the dusk side indicates that the results for these sectors could potentially be influenced by noise, as the signal-to-noise ratio is smaller, making it difficult to develop accurate wave models for these sectors.
The statistical wave models that have previously been employed within numerical codes also have no definitive answer for the lag of the geomagnetic indices that should be used to organize models. The results from the ERR analysis have identified the significant lags to use for both geomagnetic indices and solar wind parameters for a wide range of locations in the inner magnetosphere.