Data Assimilative Optimization of WSA Source Surface and Interface Radii using Particle Filtering

The Wang‐Sheeley‐Arge (WSA) model estimates solar wind speed and interplanetary magnetic field polarity in the inner heliosphere using global photospheric magnetic field maps. WSA employs the Potential Field Source Surface (PFSS) and Schatten Current Sheet (SCS) models to determine the Sun's global coronal magnetic field. The PFSS and SCS models are connected through two radial parameters, the source surface and interface radii, which specify the overlap region between the inner SCS and outer PFSS models. Though both radii values are adjustable, they have typically been fixed to 2.5 solar radii. Our work highlights how solar wind predictions improve when the radii are allowed to vary over time. Data assimilation using particle filtering (sequential Monte Carlo) is used to infer optimal values over a fixed time window. Solar wind model predictions and satellite observations are compared with a newly developed quality‐of‐agreement prediction metric. The agreement metric between the model and observations is assumed to correspond to the probability of the two key WSA model parameters, the source surface and interface radii, where the highest metric value implies the optimal radii. We find that the optimal particle filter values of solar radii can perform twice as well as standard values for an exploratory period during Carrington Rotation 1901, with these values also reducing nonphysical kinking effects seen in solar magnetic field lines. Data assimilation choices of input realization and time frame have implications for variation in the solar wind over time. We present this work's theoretical context and practical applications for prediction accuracy.


Introduction
As the Sun's magnetic field varies over the solar cycle, so does the behavior and structure of the solar wind. The Schatten Current Sheet (SCS) (Schatten, 1971) and Potential Field Source Surface (PFSS) (Altschuler & Newkirk, 1969; Schatten et al., 1969; Wang & Sheeley, 1992) models describe the global magnetic field configuration of the corona. Determining the parameters used by these models to reflect the structure of the corona accurately, for all times, remains a challenge. Recent studies (Arden et al., 2014; Lee et al., 2011; Nikolic, 2019) suggest that the core parameter of the PFSS model, the source surface radius (R ss ), may vary over the solar cycle. The Wang-Sheeley-Arge (WSA) (Arge & Pizzo, 2000; Arge et al., 2003, 2004) model uses the coupled PFSS and SCS models but elaborates on PFSS with the addition of an interface radius (R i ) parameter. Based on past studies, the model source surface and interface radii parameters have been fixed to have nominal values of approximately 2.5 solar radii (R ⊙ ). Previous work explored the benefit of various data assimilation techniques applied to the global solar magnetic maps used as input to WSA (Hickmann et al., 2015, 2016). In the work presented here, the radial model parameters (i.e., the source surface and interface radii) are optimized to improve prediction accuracy.

Figure 1. Open magnetic field lines from the Sun's surface are plotted as predicted by the Wang-Sheeley-Arge (WSA) model. The prediction assumes (R ss , R i ) = (2.51, 2.49), the standard values in WSA. Red and blue line colors indicate magnetic field polarity. The field lines demonstrate significant kinking at the interface, which is physically unlikely. Although the interface radius parameter was introduced to reduce kinking, its optimal value (and that of the source surface radius) may require tuning to this point in time.

Particle filters, also known as sequential Monte Carlo, have been used in geophysical data assimilation (Bocquet et al., 2010; Doucet et al., 2000; Gordon et al., 1993; van Leeuwen, 2010). With WSA, particle filters can be used to fine-tune the source surface and interface radii, accounting for time-varying behavior. Optimization of the WSA radii values is not tied to a specific physical attribute of the corona: We explore a general tuning method to illustrate a path toward improved operational forecasting.
Previous work indicates that temporal variation in the source surface is expected to affect the predictive accuracy of PFSS models (Arden et al., 2014; Lee et al., 2011; Nikolic, 2019), including WSA. The WSA model requires that the source surface radius be higher than the interface radius; otherwise, a nonphysical gap in the coronal field results. WSA typically employs (R ss , R i ) = (2.51, 2.49) R ⊙ . This choice roughly agrees with a 57-point search of parameter pairs (McGregor et al., 2008) that yielded (2.6, 2.3) as the optimum, based on qualitative agreement between the model and spacecraft observations. Analyzing other stretches of the solar cycle can yield different optima, such as a 15-30% increase in R ss seen in Solar Cycles 23 and 24 (Arden et al., 2014) or 1.5 to 1.8 R ⊙ in Solar Cycles 22 and 23 using a coronal-hole metric to quantify accuracy (Lee et al., 2011). Tracing of open magnetic field lines in the coupled PFSS and SCS models, as in Figure 1, shows persistent but well-known physically implausible kinking with this choice of radii. McGregor et al. (2008) found that adjusting the R ss and R i parameters to relieve the nonphysical kinking of the magnetic field lines could result in superior solar wind and interplanetary magnetic field (IMF) predictions. The selection of these parameters is a problem amenable to optimization methods. Other model properties, particularly input map selection, are the subject of other ongoing investigations. While work remains to determine the most efficacious performance measurement, a metric has been implemented that allows iterative exploration over possible radii in WSA. This iterative approach converges on the best parameters at a given time and tracks them as they evolve. Operational forecasts can use this particle filter as a basis for adaptive statistical modeling of solar parameters.
Particle filtering yields improved predictions of the solar wind velocity and magnetic field polarity as demonstrated here using past satellite observations. Adaptive estimation of the parameters of the WSA coronal model will empirically substantiate the choice of source surface in prediction models while enhancing space weather forecast accuracy throughout the solar cycle. This paper describes these developments in automated optimization of the source surface and interface radii.
Section 2 reviews global solar magnetic maps, WSA, and statistical inference, and section 3 describes the application of inference methods to the predictive accuracy of space weather forecasts. Implications and conclusions follow, suggesting how to deploy these methods in an operational pipeline.

Global Solar Magnetic Maps
Dynamic magnetic fields on the solar surface and in the corona, evolving into topologies with loops and streamers, drive space weather variability. Since measuring coronal magnetic fields is challenging, coronal and heliospheric fields are estimated with models utilizing observations of the photosphere. By assimilating a sequence of full-disk photospheric magnetograms into a global map, solar synoptic and synchronic magnetic maps (e.g., Riley et al., 2014) provide estimates of the magnetic flux distribution. Standard global solar synoptic maps, also referred to as Carrington maps (Carrington, 1863), assume a fixed mean rotation rate for the Sun.

10.1029/2020SW002464
Synoptic maps mix space and time by assimilating and aligning new data as if the Sun rotates as a solid body, where each rotation is referred to as a Carrington rotation (CR). To avoid smearing solar magnetic observations when they are assimilated into global maps, synchronic maps are generated with flux transport models (Feng et al., 2012; Schrijver & DeRosa, 2003; Ulrich & Boyden, 2006; Upton & Hathaway, 2013; Worden & Harvey, 2000). These models compensate for the spatial mixing challenge by accounting for rotational, meridional, and supergranular diffusive transport processes where measurements are not available.
The synchronic maps used in this study are generated by the Air Force Data Assimilative Photospheric Flux Transport (ADAPT) model (Hickmann et al., 2015, 2016). ADAPT incorporates photospheric magnetic flux transport (Worden & Harvey, 2000) with data assimilation to improve estimates of the global solar magnetic field. For this study, the ADAPT global magnetic maps are generated from magnetograms observed by the Kitt Peak Vacuum Telescope (KPVT) (Livingston et al., 1976). This study focuses on CRs 1901 and 1902 (beginning 29 September 1995 and ending 23 November 1995), as was done by McGregor et al. (2008), wherein a 57-point grid search found optimal (R ss , R i ) = (2.6, 2.3). The selection of KPVT photospheric observations offers the option of extending this study to a much longer time period in future work. Uncertainties in the synoptic flux map can be addressed in several ways (e.g., Pevtsov et al., 2015), of which ADAPT is one approach. Rather than a single estimate of the global flux distribution, the ADAPT model generates an ensemble of realizations (e.g., 12 realizations were used for this study) driven by different supergranulation flow patterns. Figure 2 highlights the model polar mean value estimates (i.e., latitudes 70° to 90°) for each ADAPT global map realization within the ensemble, illustrating the uncertainty from having poorly observed poles.
Global solar magnetic maps are key for forecasting the heliospheric magnetic field and solar wind velocity within the solar system. In this paper, the terms forecast and prediction are used interchangeably to indicate inference from existing data onto a future condition. Being able to predict space weather conditions carries significant economic consequences (Eastwood et al., 2017; MacAlester & Murtagh, 2014; Owens et al., 2014) for satellite operations and terrestrial infrastructure. Predictive accuracy sets the worth of forecasting models. Attaining this accuracy does not demand a complete, high-fidelity physical model of the solar environment (e.g., magnetohydrodynamic simulations; Mikić et al., 1999). A forecast model is not useful when the forecast computation is slower than real time. Simplifications can be justified when they expedite computing, even at some cost in accuracy. The WSA model detailed in section 2.2 makes such simplifying assumptions. As methods to improve accuracy become available, such as particle filtering (section 2.3), they can be implemented to the extent allowed by computational resources.

WSA Model and Prediction Metric
The WSA model (Arge et al., 2003, 2004) builds on the PFSS model (Schatten et al., 1969; Schatten, 1971) assumption of a source surface radius (R ss ) (in the original, a zero-potential surface) at which open magnetic field lines become radial. An extension of the WSA model (McGregor et al., 2008) incorporates an interface surface at radius R i (historically itself sometimes referred to as a source surface; Schatten, 1971), which helps smooth the transition of the field from the PFSS to the SCS model, mitigating the kinking; the values of (R ss , R i ) are configurable input parameters. The WSA model inner boundary utilizes the photospheric magnetic flux estimates from the ADAPT ensemble, where different ADAPT realizations are handled independently. The magnetic field estimates near the Sun for a given source and interface radius are used to predict solar wind parameters at the L1 Lagrange Point for comparison with the WIND satellite (Ogilvie et al., 1995). Although several metrics have been used (Arden et al., 2014; Lee et al., 2011; MacNeice, 2009a; Norquist & Meeks, 2010; Norquist, 2013; Owens et al., 2014), we evaluate model performance according to the scalar prediction metric H, computed from the residuals between the model predictions and the observed solar wind speed and IMF polarity values.
For this study, WSA models were run for one ADAPT map per day. Each model can generate predictions from a = 1 to 7 days into the future. WSA tends to be most accurate 3 days into the future (Owens et al., 2014), likely due to the transit time for the solar wind to propagate from the model's outer boundary to the satellite at L1, so we choose to work with the a = 3 prediction set.
A vector, t p , represents the list of all the times at which the WSA model outputs predictions, which depend on solar wind arrival time. Vector notation is used to reduce the number of subscripts; no linear algebra operations are involved. If multiple input maps are analyzed, with start day t 0 and end day t f and with day number indexed by m ranging from 0 to f, each input map will generate a set of predictions output at times a days in advance. For a single map m on date t m , this set of predictions will be a subset of all the predictions, represented as the vector t p,m = [ t m + a, … , t m + a + 1 ]. The entire set of prediction times t p is the union of these vectors t p,m :

t p = t p,0 ∪ t p,1 ∪ … ∪ t p,f ,     (1)

where t m is time in days and the union symbol indicates vector concatenation. Satellite observation times t s have an independent cadence. In order to make a quantitative comparison, we must establish a vector t c onto which we interpolate predicted and observed values. This common baseline vector t c will have a fixed time step, conveniently indexed by i such that an element (t c ) i = min t p + iΔt. We choose t c to have a time step Δt determined by the CR duration T carr = 27.2753 days and the ADAPT map resolution res (typically 2°),

Δt = T carr (res ∕ 360°),

so that one time step corresponds to one map grid cell of solar rotation. For the WIND satellite, the observation time step is approximately hourly, so the common baseline is downsampled. This common baseline reconciles the asynchronous nature of the WIND observation times, which depend on satellite operations, and the WSA prediction times, which depend instead on the modeled arrival time of the solar wind. These prediction times cannot be synced to the observation times a priori, because they depend on the model physics. Therefore, the common baseline is the most practical option.

Figure 2. (Caption fragment: the plotted period ends 23 November 1995, beginning near 1995.74; this period, in Carrington Rotations 1901 and 1902, is near the Solar Cycle 22 minimum.)
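The baseline construction can be sketched in a few lines (our own illustration in NumPy, not code from the WSA pipeline; the time-step formula Δt = T carr · res ∕ 360° is our reading of the text, and for simplicity the grid here spans only the overlap of prediction and observation times):

```python
import numpy as np

# Common-baseline time step from the Carrington rotation period and map
# resolution (assumed reconstruction: one map grid cell of rotation per step).
T_CARR = 27.2753                  # Carrington rotation duration, days
RES_DEG = 2.0                     # ADAPT map resolution, degrees
dt = T_CARR * RES_DEG / 360.0     # ~0.152 days (~3.6 hr) per step

def common_baseline(t_p, t_s):
    """Fixed-step time grid spanning the overlap of prediction times t_p
    and satellite observation times t_s (both in days)."""
    t_min = max(t_p.min(), t_s.min())
    t_max = min(t_p.max(), t_s.max())
    n_steps = int((t_max - t_min) / dt)
    return t_min + dt * np.arange(n_steps + 1)

def to_baseline(t_c, t, values, integer=False):
    """Linearly interpolate a time series onto the common baseline; polarity
    series are rounded back to integers after interpolation."""
    v = np.interp(t_c, t, values)
    return np.rint(v) if integer else v
```

Because the WIND cadence (~1 hr) is finer than Δt (~3.6 hr), interpolating the observations onto this grid effectively downsamples them, as the text describes.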
The WSA model can predict both the solar wind speed and the local polarity of the magnetic field (e.g., radially outward directed field has a polarity of = +1), and the WSA prediction metric H used here takes accuracy in both predictions into account.
The polarity fraction ⟨φ⟩ is the proportion of "correct" predictions of magnetic field polarity φ. It is constructed from the interpolated time series t c with elements indexed by i. In the interpolation, polarity values are first linearly interpolated and then rounded to the nearest integer, because polarity is an integer quantity. The satellite observations of polarity, φ s , contain values of +1 or −1; the vector of the satellite polarity time series has elements (φ s ) i . The WSA polarity prediction φ p can have values of +1 or −1, or can be 0 when the polarity is indeterminate; the vector of the WSA polarity prediction time series has elements (φ p ) i . "Correct" polarity predictions are those that match the satellite observed polarity, although it is conceivable that the satellite data may be incorrect. No effort is made to account for uncertainty in the satellite data. Where arrow brackets denote averaging and vertical bars the absolute value, the average correct polarity fraction is a rational-valued quantity given by

⟨φ⟩ = 1 − ⟨ | (φ s ) i − (φ p ) i | ⟩ ∕ 2.

Values of ⟨φ⟩ vary between ⟨φ⟩ = 1 for perfect correspondence between the predictions and observations and ⟨φ⟩ = 0 when they are perfectly mismatched. A typical value for preoptimization ⟨φ⟩ from WSA during the period studied is between 0.6 and 0.8. We also compute the average root-mean-square (RMS) velocity residuals, Δv rms , from the radial velocity vector component as predicted, (v p ) i , and seen by satellite, (v s ) i :

Δv rms = √ ⟨ ( (v p ) i − (v s ) i )² ⟩ .     (6)

Whereas the solar wind averages around 500 km·s −1 (Neugebauer & Snyder, 1966), typically ranging from 300 to 700 km·s −1 (Coles & Maagoe, 1972; Snyder et al., 1963), a typical ⟨Δv rms ⟩ = 100 to 200 km·s −1 is seen in standard WSA predictions during the period studied.
The WSA prediction metric H was developed because it is common in solar wind model validation studies (Arge & Pizzo, 2000; MacNeice, 2009a, 2009b; Norquist & Meeks, 2010; Norquist, 2013; Owens et al., 2005, 2008) to compare model predictions of radial solar wind speed with in situ observations and separately compare predictions and observations of IMF polarity. This can be problematic, however, as it is not unusual for solar wind models to predict radial speed well while getting the IMF polarity wrong, which implies that the model is incorrectly predicting the source region (i.e., coronal holes) of the solar wind. The strength of the H metric is that it requires models to predict both speed and IMF correctly to obtain a high score. Normalizing H to make this metric dimensionless may be investigated in the future, but doing so would not change the results of this analysis. Specifically, H is the ratio

H = ⟨φ⟩ ∕ ⟨Δv rms ⟩.

Preoptimization, standard-radii values of H may range from 0.003 to 0.008 km −1 ·s, as seen in Figure 3. Other epochs may better fit the standard values, with H up to 0.025 km −1 ·s. Both polarity and solar wind velocity must be predicted correctly to ensure a high metric H. Metric values are broadly consistent across ADAPT realizations of the synoptic map. Figure 3 contrasts two heatmaps of the prediction metric on a fixed 36-point grid of radii values, ranging from 1.5 to 4.0 R ⊙ in both parameters. Each heatmap is derived from a distinct ADAPT "map" (realization). Optimal radii vary between the realizations within about one solar radius.
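The three quantities can be assembled into a short reference implementation (a sketch under our reconstruction of the formulas; the arrays are assumed to be already interpolated onto the common baseline t c ):

```python
import numpy as np

def polarity_fraction(phi_s, phi_p):
    """Average correct-polarity fraction <phi>: 1 for perfect agreement,
    0 for perfect mismatch; indeterminate predictions (0) count half."""
    return 1.0 - np.mean(np.abs(phi_s - phi_p)) / 2.0

def rms_velocity_residual(v_p, v_s):
    """RMS residual between predicted and observed radial wind speed (km/s)."""
    return np.sqrt(np.mean((v_p - v_s) ** 2))

def prediction_metric(phi_s, phi_p, v_p, v_s):
    """Scalar prediction metric H = <phi> / <dv_rms>, in s/km; higher is better."""
    return polarity_fraction(phi_s, phi_p) / rms_velocity_residual(v_p, v_s)
```

With a typical ⟨φ⟩ of 0.6 to 0.8 and ⟨Δv rms ⟩ of 100 to 200 km·s −1 , this ratio reproduces the quoted preoptimization range of roughly 0.003 to 0.008 km −1 ·s.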
Inference methods such as particle filtering are adept at homing in on variables such as (R ss , R i ). Small changes in radii can have a noticeable impact on H, as illustrated in the sensitivity analysis in section 3.1. Therefore, we have chosen to focus on optimization of the radial parameters.
Nevertheless, ADAPT map realization has an important impact on WSA predictive accuracy that merits further exploration. Figure 4 graphs how, for a fixed (R ss , R i ), the H metric score can at times vary by as much as a factor of 2 between ADAPT realizations, although the variation can be of order 30% at other times. In this figure, the ensemble of ADAPT realizations moves collectively as a bundle, especially when the variation is relatively small. This supports our choice to optimize, initially, over a set of ADAPT maps of a given realization (even though it may not be the best ADAPT map realization and (R ss , R i ) combination overall). We do not consider multiple realizations because of the complexity of optimizing over an additional, discrete parameter, as well as the computational cost (see section 3.3).
Using H for inferring an optimum (R ss , R i ) is not, in itself, optimal. The preferred quantity would be a posterior probability of the parameter vector θ = (R ss , R i ). This posterior probability would be the probability density for the radii having some value, such that integrating it over an interval would give the probability of the radii being within that interval. Suppose that it were possible to compute Λ, the likelihood as a function of the satellite data and ADAPT maps, collectively "data" d. The (hypothetical) posterior probability p(θ|d) would then be proportional to Λ(θ|d) (for uniform prior probabilities in the radii). Unfortunately, Λ cannot be calculated without an uncertainty or "noise" model of these data. Constructing Λ would call for understanding uncertainties in both the magnetograms and satellite instruments. That task is beyond the scope of this paper. We observe that the partial derivative of H with respect to ⟨φ⟩ is positive, because ⟨Δv rms ⟩ is always positive, so increasing accuracy in polarity prediction improves H. Likewise, the partial derivative of H with respect to ⟨Δv rms ⟩ is always negative: The metric worsens with increasing velocity residual and improves with smaller velocity residual. Independently, these derivatives imply that higher H indicates a better fit to the data. Thus, we expect that H grows, approximately monotonically, with the unknown likelihood Λ. We then use H as a proxy for ranking in the particle filter.
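Writing the metric as the ratio H = ⟨φ⟩ ∕ ⟨Δv rms ⟩ (our reconstruction of the ratio defined above), both monotonicity claims follow directly:

```latex
\frac{\partial H}{\partial \langle \phi \rangle}
  = \frac{1}{\langle \Delta v_{\mathrm{rms}} \rangle} > 0,
\qquad
\frac{\partial H}{\partial \langle \Delta v_{\mathrm{rms}} \rangle}
  = -\frac{\langle \phi \rangle}{\langle \Delta v_{\mathrm{rms}} \rangle^{2}} < 0 .
```

Both signs hold wherever ⟨Δv rms ⟩ > 0, which is guaranteed for real, noisy observations.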
Generalization of H to multisatellite observations would be feasible given the instrumental uncertainties noted for Λ above. Rigorous uncertainties would also benefit the inference discussed below. Using Δv rms in equation (6) implicitly assumes equally weighted observations with independent, identical Gaussian distributions, as befits an L2 (Euclidean) norm. This assumption may not be valid during coronal mass ejections, which could be filtered out (although that is not done in this work), and is not valid between different satellites.
Measuring WSA performance with H is not fundamental. WSA as a predictive engine is separate from the H computation. If another metric were to prove superior, WSA (and the particle filtering in section 2.3) could substitute it. We proceed to measure the accuracy of the solar wind forecast for a given pair of parameters with the prediction metric that we do have.
The sensitivity of WSA predictions to radii settings will likely vary over the solar cycle due to a variety of factors, such as the strength of the polar magnetic fields and magnetic activity. This paper uses a modified version of WSA to support output from a sequential run of varying parameters (R ss , R i ), testing the predictive accuracy of the WSA code using historical data from a selected time period. Details on the historical data analysis follow in section 3.2.

Particle Filtering
Particle filtering (Bocquet et al., 2010;van Leeuwen, 2010), or sequential Monte Carlo, is an inference method suited to the nonlinear dependence of WSA predictive accuracy on the R ss and R i parameters. It is well-suited to ongoing data assimilation pipelines, as described further in section 2.4.
Addressing the time-varying sensitivity of the model is challenging. A plot of the envelope of forecasts for different radii pairs and two ADAPT maps is shown in Figure 5. Progress in prediction requires implementation of the particle filter. This filter replaces a grid of radii with an adaptive search through parameter values. Running in parallel, the filter enables continual integration of new data to update predictions. The question remains of how long to integrate the data: What should N times be? Because the underlying solar parameters shift, longer N times may obfuscate genuine variations in the radii. We now specify the mathematics of the filter more closely.

Figure 5. Plot of solar wind radial velocity (km·s −1 ) at the L1 Lagrange Point as observed by the WIND satellite from 29 September 1995 through 24 November 1995, compared to two ensembles of WSA predictions produced from distinct ADAPT maps. Each color represents a different point in the space of ADAPT maps and (R ss , R i ) points. Polarity aside, the optimal ADAPT map and radial parameters will be those which most closely track the evolution of the satellite observations. The closeness of this track is characterized by the prediction metric H.
The filter is initialized by selecting N sample points from within the range of allowed values for R ss and R i . In WSA, the limits on both R ss and R i span the interval [1.5, 4.0] R ⊙ . These ranges are sampled uniformly (U) and treated as open intervals, with negligible probability of sampling on an edge. A constraint in WSA is that R ss > R i . These constraints define the prior π(θ) from which to draw (∼) each sample θ j = (R ss , R i ) j :

θ j ∼ π(θ), with R ss and R i each ∼ U(1.5, 4.0), subject to R ss > R i .

Each sample θ j is supplied to the WSA code as one (R ss , R i ) parameter pair. This step could be parallelized; here, however, each θ j is run in sequence, as the WSA code is internally multithreaded. After the runs complete, we calculate the prediction metric H of every θ j and a normalization constant, C 0 ,

C 0 = Σ j′=1..N sample H(θ j′ ),     (11)

defining the cumulative "distribution" (true insofar as H ∼ Λ), c j , by summing over the dummy index j′:

c j = (1 ∕ C 0 ) Σ j′=1..j H(θ j′ ).     (12)

The values of c j monotonically increase on the interval [0, 1], reaching unity when j = N sample .
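The initialization stage can be sketched as follows (our own illustration in NumPy, not the WSA code; rejection sampling enforces the triangle constraint):

```python
import numpy as np

R_MIN, R_MAX = 1.5, 4.0   # allowed radii range, in solar radii

def sample_prior(n, rng):
    """Draw n pairs (R_ss, R_i) uniformly over the triangle R_i < R_ss,
    by rejection sampling against the WSA constraint."""
    samples = []
    while len(samples) < n:
        r_ss, r_i = rng.uniform(R_MIN, R_MAX, size=2)
        if r_ss > r_i:
            samples.append((r_ss, r_i))
    return np.array(samples)

def cumulative_weights(h_values):
    """Normalize each sample's H metric into a cumulative 'distribution' c_j."""
    c0 = h_values.sum()               # normalization constant C_0
    return np.cumsum(h_values) / c0
```

Each row of `sample_prior`'s output would be handed to WSA as one (R ss , R i ) pair, and the resulting H values fed to `cumulative_weights` for the resampling step.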
Resampling defines a particle filter. The resampling phase generates new particles in iteration k, known as θ k . For iteration k = 0, we use c 0 = c from equation (12). We first draw u k from the 1-D uniform distribution,

u k ∼ U(0, 1),

intending to sample proportional to H(θ j ). The objective is to use H as a weight in the sense of importance sampling (van Leeuwen, 2009). This goal is accomplished by choosing the first c j greater than u k , returning a set of values, denoted q (with a suppressed k index for clarity). The only reason for this complicated process is to support sampling from an arbitrary discrete distribution that cannot be specified directly in a computation, so long as random uniform samples can be drawn. The set of values q is

q(j) = min { j′ : c j′ ≥ u k }.

This set of N sample indices, q(j), prepares the next iteration of samples, θ k+1 . To avoid the collapse of the filter to a small number of samples, we employ perturbation, also known as a regularization (van Leeuwen, 2009) or resample-move (Doucet & Johansen, 2009) algorithm. We consequently invoke a perturbation kernel, which supplies independent perturbations ε k , drawn from a zero-mean normal distribution with standard deviation σ, to explore the space around each sample. The standard deviation could depend on k, as discussed later. In general, σ is kept the same for both components of θ = (R ss , R i ). Careful handling of edge effects, discussed

below, should ensure that this Gaussian proposal distribution leaves the unknown target distribution of H invariant. The next generation of particles is then given by

θ k+1 = θ q(j),k + ε k .     (16)

Note that if any θ k+1 fall outside the physical bounds of the system, which require that R min < R ss , R i < R max and R i < R ss , then the prediction metric there is axiomatically H(θ) = 0. Therefore, that sample will not be resampled in subsequent generations.
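One resampling-plus-perturbation iteration can be sketched as follows (our illustration, not the paper's code; the out-of-bounds handling here only flags invalid particles, leaving the H = 0 rule to the caller):

```python
import numpy as np

def resample_and_perturb(theta, h_values, sigma, rng, r_min=1.5, r_max=4.0):
    """One particle-filter iteration: importance-resample the particles
    proportional to their H metric, then apply a Gaussian perturbation kernel."""
    n = len(theta)
    c = np.cumsum(h_values) / h_values.sum()    # cumulative weights c_j
    u = rng.uniform(0.0, 1.0, size=n)           # u ~ U(0, 1), one per new particle
    q = np.searchsorted(c, u)                   # first index with c_j >= u
    q = np.minimum(q, n - 1)                    # guard against roundoff at c[-1]
    new = theta[q] + rng.normal(0.0, sigma, size=theta[q].shape)
    # Out-of-bounds or constraint-violating samples get H = 0 downstream;
    # here we only flag them for the caller.
    valid = ((new > r_min) & (new < r_max)).all(axis=1) & (new[:, 0] > new[:, 1])
    return new, valid
```

The `searchsorted` call implements the "first c j greater than u" selection for all N sample draws at once.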
Edge effects are the rationale for this choice. Such effects complicate sampling close to parameter-space boundaries: They must be examined empirically. An a priori requirement is that the jump probability from θ q(j),k to θ k+1 equal that of the reverse jump. This requirement is achieved in the resampler stage. To ensure that the perturbations ε k are handled consistently at the edges, this stage "samples" regions outside the space by assigning them a prediction metric H = 0. This metric is formally correct, and by design, such a point will not be the origin for resampling in the next iteration.
As k → ∞ on a fixed data set, the particle filter will asymptotically optimize H. Regions of greater H will be sampled at higher frequency. Provided σ is large enough, a maximum of H may be found. Progressive "cooling" of the kernel (reducing σ, e.g., σ ∝ exp(−k)) will then converge on the highest value. Yet the ideal σ, as well as N sample , depends on the metric. Because the matrix of partial derivatives of H is unknown, a (pseudo-)Fisher metric must be determined from sensitivity studies. This issue is explored in greater detail in section 3.2.
The histogram of the weighted samples of a particle filter will be analyzed later in this paper. In WSA, the parameter space of the particle filter is not uniform. It is triangular because of the R i < R ss constraint, with corners at (R min , R min ), (R max , R min ), and (R max , R max ). Initial sampling (or any case where H is homogeneously uniform) is uniform only in 2-D (note that the initial sampler generates parameters independently in each dimension and then filters according to the constraints until N sample are supplied). In 1-D, the marginal distribution priors are linear,

p(R ss ) = 2 (R ss − R min ) ∕ (R max − R min )²,     (17)

p(R i ) = 2 (R max − R i ) ∕ (R max − R min )².     (18)

Marginal-distribution bias does not affect 2-D particle filter sampling, but it can affect interpretation of the optimum. It also has no effect on max H, which depends only on (R ss , R i ) locally. As data assimilation progresses through further iterations, the distribution should tend to converge to an optimum, in which case the bias is reduced.
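These linear marginals can be checked by Monte Carlo (our own verification sketch, not from the paper): under a uniform density on the triangle R i < R ss , the mean of R ss sits at the 2/3 point of the interval and the mean of R i at the 1/3 point.

```python
import numpy as np

rng = np.random.default_rng(1)
R_MIN, R_MAX = 1.5, 4.0
# Draw uniformly over the square, keep the triangle R_ss > R_i
# (about half of the points survive the constraint).
pts = rng.uniform(R_MIN, R_MAX, size=(200_000, 2))
tri = pts[pts[:, 0] > pts[:, 1]]
mean_rss = tri[:, 0].mean()     # analytic: R_MIN + 2/3 (R_MAX - R_MIN) ~ 3.167
mean_ri = tri[:, 1].mean()      # analytic: R_MAX - 2/3 (R_MAX - R_MIN) ~ 2.333
```

This is the bias referred to in the twin-test discussion: the prior alone already pulls the 1-D expectation values away from the center of the interval.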

Data Assimilation Pipeline
This paper uses WSA to generate solar-wind forecasts using ADAPT global solar magnetic maps as input.
The process of using a particle filter to generate forecasts is an extension of section 2.3. A set of particles θ k is sampled based on previous measurements and then used to improve the accuracy of predictions for new, incoming data. Since the coronal magnetic topology is continually changing, predictions can only be made for a finite time window. WSA generates predictions ranging from 1 to 7 days ahead. Section 2.2 notes how 3-day advance predictions can be selected; this point in time is with respect to the input ADAPT maps. Data assimilation entails updating the set of θ k to track new maps, adaptively optimizing (R ss , R i ). When a given window has had N it iterations (indexed by k, as defined in equation (16)) of particle filtering, the next window (indexed by l) will begin with that sample set:

θ 0,l+1 = θ N it ,l .     (19)

The first step is to update the time "window" of ADAPT maps, bounded by start t 0 and end t f . Because successive windows are equal in length, the duration w ≡ t f − t 0 is fixed. A new set of prediction times t p is generated at each t 0,l+1 , per equation (1)

over the new interval. Typically, each t p has an equal N sample from equation (3). Slight variations in WSA solar wind predictions can result in different arrival times at the L1 Lagrange Point, leading to minor fluctuations in the number of samples after bracketing by the number of days in advance, a.
Aside from the specification of the window l, equations (11) through (16) are unaltered. At the end of each window, the samples are propagated to the zeroth iteration of the subsequent window by equation (19), and new ADAPT data are selected. As long as the optimum does not change more rapidly than the perturbation kernel can follow, the particle vector θ j will track it. If the perturbation kernel is too broad, however, or the particle number too small, the filter may never converge on an optimum. Operational use of the particle filter requires fine-tuning these inference meta-parameters.
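The windowed pipeline can be sketched end to end (a simplified, self-contained illustration; the `metric` callable stands in for a full WSA run plus H evaluation and is hypothetical, and invalid particles are simply redrawn from the prior here rather than carried along with H = 0):

```python
import numpy as np

R_MIN, R_MAX = 1.5, 4.0

def _sample_prior(n, rng):
    """Uniform draws on the triangle R_i < R_ss (rejection sampling)."""
    out = np.empty((0, 2))
    while len(out) < n:
        p = rng.uniform(R_MIN, R_MAX, size=(2 * n, 2))
        out = np.vstack([out, p[p[:, 0] > p[:, 1]]])
    return out[:n]

def _step(theta, h, sigma, rng):
    """Resample proportional to H, then perturb with a Gaussian kernel."""
    c = np.cumsum(h) / h.sum()
    q = np.searchsorted(c, rng.uniform(size=len(theta)))
    q = np.minimum(q, len(theta) - 1)          # roundoff guard at c[-1]
    new = theta[q] + rng.normal(0.0, sigma, size=theta.shape)
    bad = ~(((new > R_MIN) & (new < R_MAX)).all(axis=1) & (new[:, 0] > new[:, 1]))
    new[bad] = _sample_prior(bad.sum(), rng)   # simplification: redraw, not H = 0
    return new

def assimilation_loop(windows, metric, n_sample=64, n_it=3, sigma=0.1, seed=0):
    """Windowed pipeline: the final samples of window l seed window l + 1."""
    rng = np.random.default_rng(seed)
    theta = _sample_prior(n_sample, rng)       # initial draw from the prior
    history = []
    for window in windows:                     # window index l
        for _ in range(n_it):                  # iteration index k
            h = np.array([metric(t, window) for t in theta])
            theta = _step(theta, h, sigma, rng)
        history.append(theta.copy())           # these samples seed window l + 1
    return history
```

Run against a toy metric peaked at some (R ss , R i ), the particle cloud drifts toward and concentrates around the peak over successive windows, which is the qualitative behavior the twin test below verifies for the real metric.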

Twin Testing Results: Simulations of Predictions as Observations
Twin testing is the comparison between a control and an experimental group. In WSA, such a test is complicated by several factors. WSA is a simulator, based on ADAPT maps that are themselves statistical realizations of the interpretation of observational data. In the time period of CRs 1901 and 1902, the observational data are magnetograms from KPVT, as explained in section 2.1. There is no clearly defined observational "effect" present in an experiment, against which to control: The solar wind and polarity are continuous effects observed with some nonzero value. We are unaware of either theoretical or empirical estimates of the uncertainty in Δv rms or φ. Moreover, the ADAPT realizations may impart their own systematic bias on the results; this possibility is highlighted by the slight differences between H(R ss , R i ) in Figure 3. These issues are mitigated by using WSA to simulate itself: Although sensitivity to model-form uncertainty cannot be probed, it is possible to characterize the behavior of the particle filter and its prediction metric H. For reasons of computational cost, this study focuses solely on a single, arbitrarily chosen ADAPT realization (Number 5 of 12).
Two stages are involved. The first is the sensitivity analysis in Table 1, discussed immediately below. The second is the particle filtering on simulation in Figure 6, using the same analysis configuration as for Figure 7; this part is described after the discussion of the sensitivity analysis.
In both cases of the twin test, ADAPT Realization 5 was used to simulate the solar wind and polarity at a specific radii pair, predicting the 3-day advance forecast values at the position of the WIND satellite over a span of WSA dates (9989 + 3, 10046 + 3). These dates correspond to 1 October 1995 through 27 November 1995. In the sensitivity test, (3.0, 2.9) was the simulated (R ss , R i ) point; in the particle filtering test, the point was (2.6, 2.3). For each test, WSA was rerun, but using the simulation (instead of actual WIND observations) for calculating the H metric. Table 1 records H values as measured for a 7-day window, with one iteration and no particle filtering (i.e., no resampling; exact parameters). Repredicting at the simulated value itself gives an undefined, "infinite" H, as expected: The difference between the simulation and the reprediction at the same point drives ⟨Δv rms ⟩ → 0. Varying (R ss , R i ) simultaneously in half-order-of-magnitude increments quantified how quickly H decreased. A smaller number of order-of-magnitude increment variations explored the sensitivity in R ss and R i independently. Finally, several order-of-magnitude increment variations probed around (3.2, 2.3), far from the simulated value. This value was chosen for its proximity to a local optimum in particle filtering tests.
The sensitivity analysis shows that variations within about Δ < 10⁻⁴ R ⊙ are distinctive, with H two orders of magnitude higher than for variations of Δ > 10⁻³ R ⊙. This height is finite only numerically; at the simulated point itself, H is theoretically expected to diverge to infinity. A second falloff appears for Δ > 10⁻² R ⊙, losing another order of magnitude before reaching a background level of about H = 3 × 10⁻² km⁻¹·s. Such nonsmoothness makes particle-filter convergence more difficult, because the location of the optimum cannot be straightforwardly determined from the relative weights H of each sample. Naïvely, either a large number of samples, many iterations, or both would be required for the filter to land near the optimum by chance before convergence. In the more optimistic case, assuming the filter only needs to land on the second plateau of width ~10⁻² R ⊙, the 2-D space implies N sample ≈ (2.5/0.01)² samples might be needed. In practice, approximately 100× fewer samples sufficed to detect the optimum (in both the twin test and real data). Long-range correlations in the parameter space, not evident in the sensitivity study, appear responsible for this fortuitous outcome.
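The naïve sample-count estimate above can be evaluated directly. The parameter range (2.5 R ⊙, from bounds of 1.5 to 4.0 R ⊙) and the plateau width (10⁻² R ⊙) are taken from the text; the arithmetic is only a back-of-the-envelope check:

```python
# Naive estimate of samples needed to land on the H plateau by chance.
# Parameter bounds (1.5 to 4.0 R_sun) and plateau width (1e-2 R_sun)
# come from the sensitivity analysis in the text.
param_range = 4.0 - 1.5          # R_sun, extent of each dimension
plateau = 1e-2                   # R_sun, width of the second plateau
n_naive = (param_range / plateau) ** 2   # 2-D grid of plateau-sized cells
print(n_naive)        # 62500.0
print(n_naive / 100)  # ~625: the ~100x-smaller count that sufficed in practice
```

The ~625 figure is consistent with the 512 particles actually used in the filtering runs.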
Particle filtering in Figure 6 shows the convergence of the samples toward the simulated value of (2.6, 2.3), which is marked with red lines. The samples become increasingly concentrated with each resampling, and the 2-D density is maximal at the simulated value. The 1-D density also has a mode (maximum) on or near this value, but the expectation value (mean) of the histogram differs. This discrepancy is the systematic bias induced by the prior (equations (17) and (18)). The bias is nevertheless correct for a triangular parameter space of uniform density; the expected values are mathematically valid. The bias diminishes with each iteration, as the resampling is informed less by the prior and progressively more by the assimilated data. In the third window, a long tail along the R i dimension is seen, despite a relatively smaller range of supported R ss: small decreases in R ss appear to correlate with large increases in R i. This behavior implies that the interface radius can compensate for minor changes in the source surface radius. The ability of the particle filter to converge on the simulated value verifies that it integrates the information provided by WSA through the prediction metric.
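The prior-induced bias can be illustrated with a toy calculation. Assuming (as we read equations (17) and (18)) a prior uniform over the triangle R i ≤ R ss with both radii between 1.5 and 4.0 R ⊙, the 1-D marginal of R ss is linear in R ss, so its mean sits at two thirds of the range even though the 2-D density is flat; the bounds and the uniform-triangle assumption here are illustrative:

```python
import numpy as np

# Toy illustration of 1-D bias from a prior uniform over the triangle
# R_i <= R_ss. Bounds 1.5 to 4.0 R_sun match the filter's search range.
rng = np.random.default_rng(42)
a, b = 1.5, 4.0
u = rng.random(200_000)
# Inverse-CDF sampling: the R_ss marginal of a uniform triangle is
# linear, p(R_ss) proportional to (R_ss - a), so R_ss = a + (b - a)*sqrt(u).
r_ss = a + (b - a) * np.sqrt(u)
# R_i is then uniform on [a, R_ss] for each sample.
r_i = a + (r_ss - a) * rng.random(200_000)

# The 1-D mean of R_ss is a + (2/3)(b - a) ~ 3.17, not the midpoint 2.75.
print(r_ss.mean())
```

This is the sense in which the 1-D expectation values are "mathematically valid" yet pulled away from the 2-D mode.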
Twin testing validates the results of Figure 7 by demonstrating, in Figure 6, that an optimum can be identified using particle filtering. Although the real data are significantly noisier than the simulation (the optimum is less distinct in the histogram), evidence for the existence of an optimum still emerges. Particle filtering produces evidence that favors particular values of (R ss , R i ) and, using those values, is capable of enhancing the predictive accuracy of WSA. This capability is tailored to data assimilation, given adequate computational resources.

Figure 6. "Twin" experiment heatmaps (left column) and corner plots (right column) at (R ss , R i ) = (2.6, 2.3) (marked by red lines in the corner plots). Corner plots are weighted histograms of sample data; weighting is by the prediction metric, and the central 2-D histograms are marginalized into 1-D histograms above (R ss ) and to the right (R i ). 512 samples were used on 7-day windows on ADAPT Realization 5 for the first (top row), second (middle row), and third (bottom row) weeks of CR 1901, but with simulated solar wind and polarities instead of the WIND observations. No additional iterations were run between windows. The heatmaps show the raw metric values, at significantly higher resolution than Figure 3 owing to the larger number of particles. The prediction metric H is itself normalized away in the corner plots. In this experiment, the prediction metric is maximal at the location of the twin experiment simulation, as is the 2-D corner plot density. The 1-D corner plot density can be misleading, however, because of the systematic bias induced by the triangular prior (see equations (17) and (18); the bias is mathematically correct for the given space). This bias diminishes over the course of multiple resampling stages. Furthermore, the statistical support for the prediction metric away from the simulated value is much lower than the background level in Figure 7.

Prediction-metric-weighted samples are shown in a 2-D histogram with 1-D marginalizations; the prediction metric H is itself normalized away. No additional iterations were run between windows. The samples converge over the course of 3 weeks toward a relatively high value. Support is strongest for a 2-D-maximum H value around (R ss , R i ) = (3.5, 3.0), although the data are consistent with a broad range of values. In the first window, the maximum at (3.54, 3.03) has H = 0.010 km⁻¹·s, 3 standard deviations above the mean H of 0.0045 km⁻¹·s. The past-standard WSA parameters near (2.51, 2.49) have H similar to the mean. Comparable performance is seen in subsequent windows. This suggests that particle filtering may double the predictive accuracy of WSA.

10.1029/2020SW002464

Analysis of CRs 1901 and 1902
Previous WSA "optimization" by McGregor et al. (2008), mentioned in section 2.1, was done visually rather than in an objective or rigorous way. One hypothesis is that the particle filter would rediscover this optimum, while recovering higher accuracy and slight variation over time.
From past research (Arden et al., 2014), we would expect (R ss , R i ) variations to be small over the weekly windows w of the ADAPT maps. Section 2.1 discusses why large variations are believed to have a time scale on the order of the solar cycle. Data assimilation (section 2.4) may yet find that predictions are optimized by relatively large, fast variations of the model radii. Such variation is not necessarily an indication of physical changes in the corona. Only with further understanding of systematic error in WSA, and additional observations for comparison, could we attribute it to physical alterations in the coronal magnetic field.

Solar Wind Variation and Particle Tracking
The optimum (R ss , R i ) parameters are those that most closely match the solar wind velocity and polarity over time. Figure 5 illustrates solar wind variation for the particle ensembles of two ADAPT maps on CRs 1901 and 1902. The ensemble brackets the satellite observations, although no single prediction matches them perfectly. Several features in the figure are reproduced by this ensemble, for example the rises in wind velocity around 19 October 1995 (Day 10010) and 31 October 1995 (Day 10022); notably, one subset of the ensemble follows the wind in the 10 days before 9 October 1995 (Day 10000), whereas another subset diverges. The more closely these features are tracked, the lower the denominator in equation (7), raising the score H. Graphs of this form (performance versus time) are common in past studies (Arden et al., 2014), along with maps of performance across Carrington longitudes (Lee et al., 2011). For two reasons, we focus and conclude our study with an analysis of H on the 2-D plane of (R ss , R i ). First, Figure 5 is hard to quantify; it is difficult to distinguish whether a model accurately captures the observed solar wind. Second, as in the twin test, the 2-D parameter space is the particle filter's domain; time is extraneous. Performance is best quantified by the prediction metric H as a surface over (R ss , R i ), in both real and simulated data.
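Equation (7) is not reproduced in this excerpt. As a minimal illustrative stand-in, consistent with the description here and with the quoted units of H (km⁻¹·s), one can imagine a score that divides a polarity-agreement fraction by the RMS velocity residual; all names below are hypothetical, and the paper's equation (7) defines the actual form:

```python
import numpy as np

def prediction_metric(v_pred, v_obs, pol_pred, pol_obs):
    """Hypothetical stand-in for the paper's H (equation (7)): a
    polarity-agreement fraction divided by the RMS velocity residual
    (units km^-1 s). Only the qualitative behavior matters here:
    closer tracking of the wind gives a higher score."""
    dv_rms = np.sqrt(np.mean((v_pred - v_obs) ** 2))  # km/s
    pol_frac = np.mean(np.sign(pol_pred) == np.sign(pol_obs))
    return pol_frac / dv_rms

v_obs = np.array([400.0, 450.0, 500.0, 430.0])  # km/s, synthetic
pol_obs = np.array([1, 1, -1, -1])
close = prediction_metric(v_obs + 10.0, v_obs, pol_obs, pol_obs)
far = prediction_metric(v_obs + 100.0, v_obs, pol_obs, pol_obs)
assert close > far  # a smaller velocity residual raises the score
```

Whatever its exact form, the key property exploited by the filter is this monotonicity: predictions that track the observed features receive higher weight.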

Particle Filter Prediction Optimization
In Figure 3, two ADAPT realizations are seen to have somewhat different maxima in (R ss , R i ) space. Optimizing over all realizations is currently too costly, as discussed in section 3.3. Assuming that a single realization is sufficient, we explore the possibility of optimizing this one map. Figure 7 depicts the analysis of ADAPT Realization 5, chosen as representative of the 12 available from ADAPT; results are not necessarily optimal across all realizations.
ADAPT Realization 5 was used from WSA Days 9989 through 10010 (29 September 1995 through 20 October 1995). The figure shows the weighted 2-D histogram of samples, multiplied by prediction weights. The best predictions in the first few weeks are obtained for these 7-day windows at about R ss = 3.9 R ⊙ and R i = 3.4 R ⊙, the values plotted in Figure 8. Later weeks favor source radii around R ss = 3.5 R ⊙ and interface radii around R i = 3.0 R ⊙, although the interface radius histogram converges slowly. The full width of the distribution at half maximum is about one solar radius. We do not place precise error bars on these numbers, because of the prior's systematic bias (noted in "twin" testing, Figure 6). The sample density indicates relatively stable parameters over a 7-day windowing time scale. Most significantly, the optimum value performs markedly better than the parameter-space average: in Figure 7, the optimum prediction metric H is approximately 2× the mean, implying that particle filtering does improve predictive accuracy.
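The weighted histograms of Figure 7 can be reproduced schematically. The sample positions and weights below are synthetic; only the construction (a 2-D histogram weighted by the prediction metric, then marginalized to 1-D, with the metric normalized away) follows the figure's description:

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic particle samples and stand-in H weights (not the paper's data).
r_ss = rng.normal(3.5, 0.3, 512).clip(1.5, 4.0)
r_i = rng.normal(3.0, 0.3, 512).clip(1.5, 4.0)
h = np.exp(-((r_ss - 3.5) ** 2 + (r_i - 3.0) ** 2))

# 2-D histogram weighted by the prediction metric, then normalized away.
h2d, xedges, yedges = np.histogram2d(
    r_ss, r_i, bins=24, range=[[1.5, 4.0], [1.5, 4.0]], weights=h)
h2d /= h2d.sum()

# 1-D marginalizations, as in the corner plots of Figures 6 and 7.
p_rss = h2d.sum(axis=1)   # marginal over R_i
p_ri = h2d.sum(axis=0)    # marginal over R_ss
assert np.isclose(p_rss.sum(), 1.0) and np.isclose(p_ri.sum(), 1.0)
```

A full-width-at-half-maximum estimate then amounts to measuring the span of marginal bins above half the peak value.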
Experimentation was needed to fine-tune the window time w, particle number N sample , and number of iterations N it . WSA generated approximately 7 prediction points per day, so a 7-day window contained ≈ 50 predictions. This number forestalled overfitting to a few prediction points, a concern with shorter windows. The particle number, 512, was settled on after experimentation with smaller counts. While the heatmaps in Figure 3 show some detail with a 36-point grid, the particle filter tends to collapse with so few points; collapse entails failure to resample from any but the highest-H samples. Figure 7, in contrast, contains some samples outside the highest-H regions. Starting from O(100) samples, and reliably with a few times that number, the filter sampled the entire parameter space. This is necessary for data assimilation, so that the particle filter can track temporal changes in the optimum (R ss , R i ). For reasons of computational cost (section 3.3), N it is only 1 (the data are only sampled once); although this impedes accurate measurement of detail around the optimum, it allows more samples to be used per iteration. Finally, each resampling used a Gaussian perturbation kernel (see equation (15)) with σ = (R max − R min )/20.0, which, given bounds of 1.5 to 4.0 R ⊙, was ≈ 0.125 R ⊙. This number balanced exploration of the parameter space against convergence around an optimum. Because 0.125 R ⊙ is relatively large compared to the parameter space and the sensitivity seen in Table 1, it may be possible for the filter to converge more fully. Achieving this convergence will require reducing the perturbation kernel while still allowing for movement of the optimum over time; tuning such a time-evolving kernel will require careful further evaluation.

Figure 8. Observation compared to WSA predictions, using both the standard (R ss , R i ) = (2.51, 2.49) and the particle-filter-optimized radial parameters (R ss , R i ) = (3.9, 3.4), manifested in the solar wind polarity (top) and radial velocity (middle). For these plots, the optimized radial parameters remain fixed, and WSA is run in operational mode. The prediction metric H (bottom) improves by about a factor of 2 during this time span. Each H value is calculated for the 3-day advance predictions over a 7-day window; a point (H, t) at time t corresponds to the 7-day window that predicts [t, t + 7] days, using ADAPT map input from [t − 3, t + 4] days.
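The resampling and perturbation step described above can be sketched as follows. The kernel width σ = (R max − R min)/20 = 0.125 R ⊙ and the bounds of 1.5 to 4.0 R ⊙ come from the text; the systematic weighting-by-H draw and the clipping at the bounds are our illustrative assumptions:

```python
import numpy as np

R_MIN, R_MAX = 1.5, 4.0          # parameter bounds (R_sun), from the text
SIGMA = (R_MAX - R_MIN) / 20.0   # perturbation kernel width, = 0.125 R_sun

def resample(particles, weights, rng):
    """One resampling stage: draw particles in proportion to their H
    weights, then jitter each with a Gaussian kernel of width SIGMA.
    Clipping to the bounds is an illustrative choice, not from the paper."""
    weights = np.asarray(weights, dtype=float)
    probs = weights / weights.sum()
    idx = rng.choice(len(particles), size=len(particles), p=probs)
    new = particles[idx] + rng.normal(0.0, SIGMA, size=particles.shape)
    return np.clip(new, R_MIN, R_MAX)

rng = np.random.default_rng(0)
# 512 particles in (R_ss, R_i) space, initially uniform over the bounds.
particles = rng.uniform(R_MIN, R_MAX, size=(512, 2))
weights = rng.random(512)        # stand-in for prediction-metric values
particles = resample(particles, weights, rng)
assert particles.shape == (512, 2)
```

Shrinking SIGMA over successive windows is the "time-evolving kernel" whose tuning the text defers to future work.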
Particle filtering improves the performance of space weather predictions. The solar wind velocity and polarity residuals improve, as visible in Figure 8 and quantified in the prediction score. Roughly a 2× improvement in score is seen over the standard values, which were motivated by the initial grid investigation during development of the WSA interface radius. Figure 9 shows the improvement in the magnetic field lines compared to Figure 1; these field lines are less kinked, which is more physically realistic.

Figure 9. Solar magnetic field lines traced at (R ss , R i ) = (3.50, 2.51), a possible optimum from the particle filter. The transition to radial magnetic field lines is relatively smooth, indicating that the WSA predictions are more physically self-consistent than in Figure 1.

Computational Cost Projection for Operations
The analyses in this paper were performed on a machine with 22-core Intel Xeon E5-2699v4 processors at 2.20 GHz; a total of 88 logical cores are available. The WSA simulation code is written in Fortran, multithreaded and using the Message Passing Interface, and is executed through a Python user interface. Results in section 3 were run in 32-thread mode. The particle filter, written in Python specifically for this paper, wraps around this interface.
The computational cost of the particle filter is far higher than that of an individual WSA run. One run requires a duration

T single = T setup + T corona + T plot ,

where T setup involves reading and resizing the N ADAPT input maps (ADAPT realizations; the code can change the resolution of the original input maps, here regridding from 1° to 2°), taking several minutes per CR, and T corona is the time in which the WSA model computes the coronal magnetic field, propagates the solar wind out to the observation point, and then predicts magnetic polarity and solar wind velocity there. This term can be factored as T corona ≈ τ WSA N day , with τ WSA ≈ 25 ± 5 s and N day the number of days to analyze. T plot is generally negligible, as the plotting stage is fast. A single CR is analyzed in less than an hour.
A particle filter requires evaluations of the WSA model at many points and is therefore more computationally expensive. The total time, T pf , is dominated by the filter time, T filter . The filter in principle may run N iter iterations per window, where each window has a duration of N day days; in our data assimilation, N iter has been set to 1. An additional, uniform sampling accounts for the initial window. With N window the total number of windows, each computation must be done N sample times, yielding

T filter ≈ N sample (N iter N window + 1) τ WSA N day .    (23)

The need to sample repeatedly dominates the particle filter's cost: a 3-week, 7-day-window, 1-iteration analysis with approximately 500 samples can require approximately 4 days of computing time (given the resources presently available). This does not preclude real-time prediction, however. Critically, the maximum number of samples available for predicting one ADAPT map in real time, N rt , is determined by rearranging equation (23) with an allowed T filter = 24 hr and N window = N day = 1, evaluating to ≈ 1700 for N iter = 1 and τ WSA = 25 s. This is an adequate number of particles for filtering, although a faster WSA would be desirable to allow more ADAPT maps and iterations to be run. The primary way to speed up WSA would be additional cores, as the WSA calculations (at a single sample in (R ss , R i ) space) are trivially parallelizable.
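This cost model can be checked against the two figures quoted in the text (≈4 days for the 3-week analysis, and ≈1700 real-time samples). The function below simply evaluates the relation in equation (23) as reconstructed above:

```python
TAU_WSA = 25.0  # seconds per analyzed day for one WSA evaluation (from the text)

def filter_time(n_sample, n_iter, n_window, n_day, tau=TAU_WSA):
    """Particle filter wall time in seconds: each of n_sample particles is
    evaluated once per iteration per window, plus one initial uniform
    sampling pass, at tau seconds per analyzed day of the window."""
    return n_sample * (n_iter * n_window + 1) * tau * n_day

# 3-week analysis: 500 samples, 1 iteration, 3 windows of 7 days each.
print(filter_time(500, 1, 3, 7) / 86400)  # ~4.05 days

# Real-time budget: invert for n_sample with T_filter = 24 hr, 1-day window.
n_rt = 86400 / filter_time(1, 1, 1, 1)
print(n_rt)                               # 1728.0, i.e., ~1700 samples
```

Both quoted figures fall out of the same relation, which supports the stated real-time feasibility.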

Conclusion
The WSA model of the solar coronal magnetic field and wind can be combined with modern data assimilation methods: particle filtering has been demonstrated in this paper to improve the performance of its predictions by approximately twofold, as quantified by the prediction metric. Model uncertainty remains unknown, and a number of possible extensions remain untested. In practical terms, ADAPT map variation is the largest issue impeding further optimization. The particle filter does not currently optimize over ADAPT maps, owing to computational cost; when resources become available, this problem can be readily addressed within the particle filter, because the WSA computations are inherently parallelizable. Real-time optimization of the WSA model across all ADAPT maps is estimated to be possible with a computing cluster of a few thousand cores. How the filter behaves over the additional dimension of ADAPT realization, however, remains to be analyzed, as does code performance over many Carrington rotations: it is unclear whether the particle set, stable over weeks, would endure over years of data. The benchmarks set here indicate further potential.
A crucial direction to investigate is the addition of more information into H, via more satellites or via quantities such as coronal hole area or enclosed magnetic flux. Adding this information to H would require characterizing the relative error and uncertainty in each set of observational data, which would also benefit theoretical understanding of the metric. Entirely different metrics besides H may prove necessary to capture richer solar dynamics. Deeper physical understanding of the structure of the coronal magnetic field may also be required, as other measurements will capture kinking magnetic field lines differently from the readings of a single satellite.
Space weather predictions have the potential for greater accuracy. The questions raised by examination of the PFSS source surface radius in past papers (Arden et al., 2014; Lee et al., 2011; Nikolic, 2019) have not been definitively answered; rather, a new tool for addressing them now exists. Our work enables advanced exploration of these issues using statistical data assimilation. Dynamic, adaptive forecasting is becoming computationally tractable in WSA and may yield accuracy enhancements in forthcoming solar cycles. An evolving corona will become a standard element of space weather forecasting.