Improving Multiday Solar Wind Speed Forecasts
Abstract
We analyze the residual errors for the Wang-Sheeley-Arge (WSA) solar wind speed forecasts as a function of the photospheric magnetic field expansion factor (f_{p}) and the minimum separation angle (d) in the photosphere between the footpoints of open field lines and the nearest coronal hole boundary. We find that the residual speed errors are systematic when mapped as a function of f_{p} and d. We use these residual error maps to apply corrections to the model speeds. We test this correction approach using 3-day lead time speed forecasts for an entire year of observations and model results. Our methods can readily be applied to develop corrections for the remaining WSA forecast lead times, which range from 1 to 7 days in 1-day increments. Since the solar wind density, temperature, and the interplanetary magnetic field strength all correlate well with the solar wind speed, the improved accuracy of solar wind speed forecasts enables the production of multiday forecasts of the solar wind density, temperature, pressure, interplanetary field strength, and geophysical indices. These additional parameters would expand the usefulness of Air Force Data Assimilative Photospheric Flux Transport-WSA forecasts for space weather clients.
Key Points

Residual Wang-Sheeley-Arge solar wind speed errors are systematic versus both expansion factor and angle to the coronal hole boundary

We improve solar wind speed forecasts by correcting the model speeds using these systematic residual speed errors

Improved solar wind speed forecast accuracy can enable multiday solar wind density, temperature, and field strength forecasts
Plain Language Summary
We significantly improve the accuracy of the Wang-Sheeley-Arge solar wind speed forecasts by correcting the speeds using the residual speed errors as a function of two quantities. Those two quantities are how close the magnetic field line footpoints are to the coronal hole boundary, and how much the magnetic field lines in the corona bend. Our improved solar wind speed forecasts have the potential to enable multiday forecasts of the solar wind density, temperature, and interplanetary magnetic field strength since these quantities correlate well with the solar wind speed. Forecasts of these additional quantities are needed to improve multiday space weather forecasts.
1 Introduction
The fast solar wind has long been associated with coronal holes, large structures with a unipolar magnetic field polarity that appear dark in EUV images of the Sun (Harvey & Recely, 2002). Coronal holes appear dark in EUV because they typically have a lower density and temperature than the background corona (Harvey & Recely, 2002; Heinemann et al., 2021). The two main potential sources for the slow wind are hot streamers that appear bright in EUV images and the edges of coronal holes (Slemzin & Shugai, 2015; Wang & Ko, 2019). Both the streamers and coronal holes tend to be long-lived structures; therefore, a spacecraft in the solar wind will spend days in a given slow wind or fast wind region. Coronal streamers and coronal holes can last longer than a solar rotation (25 days) such that the same structure can be observed several times. When observed in Earth orbit, the solar rotation appears as a 27-day periodicity in solar wind observations owing to Earth's orbital motion around the Sun (Bartels, 1934).
Generally, the solar wind speed is the easiest solar wind plasma property to forecast because it has a long autocorrelation time (59 hr) compared to other solar wind and IMF quantities (Borovsky et al., 1998; Elliott et al., 2013; Gosling et al., 1972). Autocorrelation times for the solar wind density and temperature are 16 and 19 hr respectively, and autocorrelation times for the IMF strength and components in GSM Bx, By, and Bz are 20, 29, 14 and 4 hr respectively (Elliott et al., 2013). A long autocorrelation time means that prior measurements contain information that can be used to extrapolate to find future values. Autocorrelation times of 1–2 days imply that it takes several days for a spacecraft to transit wind from a given source region, and that the solar wind plasma properties for a specific type of source region span a unique range of values.
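As an illustration of what an autocorrelation time measures, the following minimal sketch estimates it for an evenly sampled series as the first lag at which the autocorrelation function falls below 1/e. The 1/e threshold, the function name, and the synthetic test series are assumptions made for this example; they are not necessarily the definition or data used in the cited studies.

```python
import numpy as np

def autocorrelation_time(series, dt_hours):
    """First lag (in hours) at which the autocorrelation drops below 1/e.

    Assumes evenly sampled data without gaps. The 1/e threshold is an
    illustrative choice, not necessarily the one used in the cited work.
    """
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    n = x.size
    # Direct autocorrelation; keep only the non-negative lags.
    acf = np.correlate(x, x, mode="full")[n - 1:]
    acf = acf / acf[0]
    below = np.nonzero(acf < 1.0 / np.e)[0]
    return below[0] * dt_hours if below.size else np.nan

# Synthetic hourly AR(1) series built to have an e-folding time near 59 hr.
rng = np.random.default_rng(0)
phi = np.exp(-1.0 / 59.0)
speed = np.empty(20000)
speed[0] = 0.0
for i in range(1, speed.size):
    speed[i] = phi * speed[i - 1] + rng.normal()

tau = autocorrelation_time(speed, dt_hours=1.0)
print(f"estimated autocorrelation time: {tau:.0f} hr")
```

Real solar wind records contain data gaps, so an operational version would need gap handling that this sketch omits.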
Since the Sun rotates, any structure lasting more than one solar rotation will appear again one rotation later and can be used as a forecast. Owens et al. (2013) showed that such a 27-day recurrence model works quite well for the solar wind speed and temperature and the IMF Bx (inward or outward field component) when compared against a statistical climatological model. However, the 27-day recurrence model did not work quite as well for other quantities. A source of error for the 27-day recurrence model is that the source regions (coronal holes and streamers) evolve over the course of a rotation, changing in size and shape, which causes the properties of the emitted wind to vary.
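The recurrence idea can be written in a few lines. The sketch below is an illustration under simple assumptions (evenly sampled data, a fixed 27-day lag, hypothetical function and variable names) and is not the implementation used by Owens et al. (2013): each forecast value is the observation from one synodic rotation earlier.

```python
import numpy as np

def recurrence_forecast(speed, samples_per_day, period_days=27):
    """Forecast each sample as the value observed one rotation earlier."""
    speed = np.asarray(speed, dtype=float)
    lag = int(round(period_days * samples_per_day))
    forecast = np.full(speed.shape, np.nan)  # no forecast for the first rotation
    forecast[lag:] = speed[:-lag]
    return forecast

# Hypothetical hourly series that repeats exactly every 27 days.
hours = np.arange(24 * 81)
speed = 450.0 + 150.0 * np.sin(2.0 * np.pi * hours / (27 * 24))
forecast = recurrence_forecast(speed, samples_per_day=24)
```

For a perfectly periodic source the forecast is exact. Any evolution of the coronal holes or streamers between rotations appears directly as forecast error.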
Coronal holes of different sizes can emit fast wind with a range of properties. In particular, larger coronal holes such as the polar coronal holes emit wind in the range from 700 to 860 km s^{−1} (Elliott et al., 2012; McComas et al., 2002). However, such speeds are rare because the more common smaller equatorial coronal holes typically produce wind speeds ranging from 500 to 700 km s^{−1} (Elliott et al., 2012; McComas et al., 2002; Rotter et al., 2012). Similarly, Nolte et al. (1976) demonstrated during the Skylab era that large near equatorial coronal holes produced high speed wind streams, which is also consistent with Ulysses observations indicating that the wind is very fast above the center of large polar coronal holes (McComas et al., 2000). Arge et al. (2003) found that deeper inside a coronal hole and farther away from the coronal hole boundary, the solar wind speed is faster. This finding is consistent with Ulysses observations indicating that the wind is very fast in the center of large polar coronal holes (McComas et al., 2000).
Magnetic field lines derived from photospheric magnetic flux maps are frequently used to produce multiday forecasts of the solar wind speed since the amount of bending, or expansion factor, of the open field lines correlates with the solar wind speed at 1 au (Wang & Sheeley, 1990, 1992, 1995). This model is referred to as the Wang-Sheeley model. Subsequently, the Wang-Sheeley model was updated and improved by combining Wang and Sheeley's magnetic expansion factor (f_{p}) with the Riley et al. (2001) use of the minimum separation angle (d) between footpoints of open magnetic field lines and the nearest open-closed field line boundary corresponding to the boundary of the coronal hole (Arge & Pizzo, 2000; Arge et al., 2003, 2004; McGregor et al., 2008; McGregor, Hughes, Arge, Owens, Odstrĉil, 2011). The Wang-Sheeley-Arge (WSA) model is often used to drive the inner boundary of global 3-D MHD simulations, and provides solar wind speed forecasts at the National Oceanic and Atmospheric Administration Space Weather Prediction Center website.
From work by Harvey and Recely (2002) and de Toma (2010), we know that the coronal hole sizes and locations vary over the course of the solar cycle. In particular the predecessors of the polar coronal holes form at midlatitudes and increase in size and move to the poles at solar minimum. The 2010 paper by de Toma found that the polar coronal holes were smaller in cycle 24 compared to those in cycle 23. The WSA model does account for such changes in the location and sizes of coronal holes because the photospheric magnetic flux maps, which are continually updated, are used to estimate the magnetic field expansion factor and location of the footpoints.
In this paper, we describe key aspects of the coupled Air Force Data Assimilative Photospheric Flux Transport (ADAPT)-WSA models and solar wind speed observations, examine the residual speed errors as a function of the magnetic field expansion factor and minimum separation angle to the coronal hole boundary, use the residual errors to develop a correction array (map) for the 3-day forecasts, and test the 3-day forecasts with corrections applied. In the discussion, we describe how this work can be improved and used to extend the capabilities of the ADAPT-WSA models.
2 Methods
2.1 Air Force Data Assimilative Photospheric Flux Transport Model
We use the ADAPT model to produce an ensemble of global solar magnetic field forecasts (i.e., for this study, a set of 12). Flux transport models and assimilation techniques are used to avoid smearing when producing global magnetic flux maps (Feng et al., 2012; Schrijver et al., 2003; Upton & Hathaway, 2014; Worden & Harvey, 2000). Applying flux transport to global maps partially compensates for the spatial mixing challenge by accounting for rotational, meridional, and supergranular diffusive transport processes where measurements are not available. The ADAPT technique improves global solar magnetic maps by incorporating a photospheric magnetic flux transport model with data assimilation methods (Arge et al., 2011, 2013; Henney et al., 2012, 2015; Hickmann et al., 2015, 2016). The ADAPT model evolves the observed solar magnetic flux using relatively well understood transport processes where measurements are not available and then updates the modeled flux with new observations using data assimilation methods (Hickmann et al., 2015). The ADAPT global magnetic maps are publicly available at https://www.nso.edu/data/nispdata/adaptmaps/.
2.2 Wang-Sheeley-Arge Model
The WSA model (Arge & Pizzo, 2000; Arge et al., 2003, 2004; McGregor, Hughes, Arge, Owens, Odstrĉil, 2011) is a combined empirical and physics-based model of the corona and solar wind. It is an improved version of the original Wang and Sheeley model (Wang & Sheeley, 1992, 1995). Ground-based line-of-sight observations of the Sun's surface magnetic field are input into WSA as global synoptic maps. These maps are then used in a magnetostatic potential field source surface (PFSS) model (Altschuler & Newkirk, 1969; Schatten et al., 1969; Wang & Sheeley, 1995), which determines the coronal field out to 2.5 solar radii (R_{⊙}). The output of the PFSS model serves as input to the Schatten Current Sheet (SCS) model (Schatten, 1971), which provides a more realistic magnetic field topology of the upper corona. Only the innermost portion (i.e., from 2.5 R_{⊙} to between 5 and 30 R_{⊙}) of the SCS solution, which actually extends out to infinity, is used. An empirical speed relationship (Arge et al., 2003, 2004) is then used to assign the solar wind speed at this outer boundary. The speed is a function of two parameters: (a) the flux tube expansion factor (f_{p}) and (b) the minimum separation angle (d) in the photosphere between the footpoints of open field lines and the nearest open-closed field line boundary, taken to approximate the coronal hole boundary. For the remainder of the paper, we refer to this separation angle as the angle to the coronal hole boundary. These parameters are determined by starting at the center of each grid point on the outer coronal boundary surface and tracing the magnetic field lines down to their footpoints rooted in the photosphere. The flux tube expansion factors are calculated using the traditional definition (Wang & Sheeley, 1992), f_{p} = (R_{⊙}/R_{ss})^{2} [B(R_{⊙})/B(R_{ss})], where B(R_{⊙}) and B(R_{ss}) are the field strengths along each flux tube at the photosphere and the source surface (R_{ss} = 2.5 R_{⊙}), respectively.
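To make the expansion factor concrete, the sketch below evaluates the traditional Wang and Sheeley definition, f = (R_sun/R_ss)^2 B(R_sun)/B(R_ss), for two hypothetical flux tubes. The function name and the field strengths are illustrative assumptions; the only value taken from the text is the 2.5 solar radii source surface.

```python
R_SUN = 1.0   # photospheric radius, in solar radii
R_SS = 2.5    # source surface radius, in solar radii (value quoted in the text)

def expansion_factor(b_photosphere, b_source_surface, r_ss=R_SS):
    """Flux tube expansion factor: (R_sun / R_ss)^2 * B(R_sun) / B(R_ss)."""
    return (R_SUN / r_ss) ** 2 * abs(b_photosphere) / abs(b_source_surface)

# Hypothetical photospheric field strength (arbitrary units).
b_phot = 10.0

# A purely radial field falls off as 1/r^2, so its expansion factor is 1.
b_ss_radial = b_phot / R_SS ** 2
f_radial = expansion_factor(b_phot, b_ss_radial)

# A tube whose field is 20 times weaker at the source surface than the
# radial case has expanded 20 times more than radially.
f_super = expansion_factor(b_phot, b_ss_radial / 20.0)
print(f_radial, f_super)
```

The normalization by (R_sun/R_ss)^2 is what makes f = 1 the reference case of purely radial expansion.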
The model provides the source footpoint location, the radial magnetic field, polarity, and the solar wind speed at the outer coronal boundary surface. The field and speed information is then fed into either a simple 1-D kinematic propagation code that takes stream dynamic interactions into account in an ad hoc fashion (Arge et al., 2004), as we do in this paper, or fed into advanced 3-D MHD solar wind propagation models such as the former LFM-helio (Merkin et al., 2016; Pahud et al., 2012) and now Gamera (Zhang et al., 2019), Enlil (named after the Sumerian god of wind and storms) (Lee et al., 2015, 2013; McGregor, Hughes, Arge, Odstrcil, Schwadron, 2011; Odstrĉil et al., 2005), MS-FLUKSS (Manoharan et al., 2015), and the EUHFORIA model (Pomoell & Poedts, 2018).
We use the latest version of WSA, v4.2, which is a significantly enhanced and more capable version of the model compared to WSA 2.2, which has been in operation at the National Centers for Environmental Prediction since 2011 (Parsons et al., 2011). The code is now parallel (e.g., magnetic field lines are traced in parallel) and fully compatible with the ADAPT global solar magnetic map ensemble. The ADAPT model provides 12 simultaneous estimates (or realizations) of the global photospheric magnetic field distribution every 2 hours at resolution. The ADAPT output includes the full ensemble of 12 global magnetic map realizations, each driven by differing random supergranulation flow patterns. While older, non-parallel versions of WSA ran quickly compared to advanced coronal MHD codes, it was necessary to parallelize the model to prevent it from being overwhelmed by the large volume of instantaneous input maps that ADAPT can provide. For instance, running a 28-day (i.e., about a Carrington rotation) set of ADAPT maps requires processing 4,032 input maps. This is far more than any large scale MHD model can practically process in a reasonable timeframe. The improved WSA 4.2 processing speed makes it possible to produce a year's worth of ADAPT-WSA solar wind speed forecasts at L1 quickly.
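The 4,032 figure follows directly from the cadence and ensemble size quoted above, as this short check shows (the variable names are ours):

```python
days = 28                  # about one Carrington rotation
maps_per_day = 24 // 2     # one set of maps every 2 hours
realizations = 12          # ADAPT ensemble size

total_maps = days * maps_per_day * realizations
print(total_maps)
```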
2.3 Data Set and Timeframe
We test the forecasts using the OMNI data set, which is a collection of vetted solar wind observations spanning from 1963 to the present. The normalizations used to combine these data sets are described in King and Papitashvili (2005), and in the online documentation (https://omniweb.gsfc.nasa.gov/). The WSA model is designed to forecast the background solar wind. Throughout 2003 there were many highly geoeffective long duration high speed streams with speeds ranging from 700 to 830 km s^{−1}. Such high speeds can cause the Kp geomagnetic index to reach 6 or 7 (Elliott et al., 2013). In 2003, there was a very large outward polarity polar coronal hole extension from the south pole that reached low latitudes (Elliott et al., 2010, 2012) as shown in Figure 1. This polar coronal hole extension produced wind speeds as fast as 830 km s^{−1} (Elliott et al., 2012). Such high speed wind is rarely observed at low latitudes, but is similar to the fast wind speeds at high latitudes in the center of polar coronal holes observed by Ulysses (Elliott et al., 2012; McComas et al., 2002). In 2003 on the opposite side of the Sun, there was a smaller inward polarity coronal hole that produced wind speeds more typical for equatorial holes, ranging from 500 to 700 km s^{−1}. The outward polarity fast solar wind streams and the coronal holes were very long-lived. The wind speed emerging from the coronal hole with outward polarity is consistent with the estimates based on the wind speed-latitude relationship derived from Ulysses polar observations shown in light blue in Figure 1 (Elliott et al., 2012; McComas et al., 2000).
We chose 2003 based on our knowledge of the solar wind distribution as a function of time from prior work in Elliott et al. (2012, 2016). To clearly demonstrate why we chose 2003, in Figure 2 we show the total number of valid solar wind speed data points each year in black. Additionally, the total number of valid solar wind speed data points without Interplanetary Coronal Mass Ejections (ICMEs) is shown in gray. The other colored lines show the number of points without ICMEs for given speed ranges with the slow wind (<450 km s^{−1}) in blue, the moderately fast wind (500–700 km s^{−1}) in orange, and the very fast wind (700–860 km s^{−1}) associated with polar coronal hole extensions in red. In Section 2.5 we describe how we removed the ICMEs and regions adjacent to the ICMEs to obtain all the data points labeled as non-ICME. Year 2003 (vertical dashed line) has a significant number of data points for all three of these speed categories. In 1973 and 1974, there were a significant number of data points at high speeds (700–860 km s^{−1}), but the magnetic flux maps needed for ADAPT-WSA are not available for those years. In 1994 there were many points at high speeds (700–860 km s^{−1}), but the total number of solar wind speed data points was much lower than in 2003 (note the log scaling). No year after 2003 has had many non-ICME data points at high speeds (700–860 km s^{−1}). If we had not chosen 2003 as our test year, we would have needed a much longer test interval spanning many years. Therefore, for this initial study we chose to use 2003 since this year offers good coverage of the slow, moderately fast, and very fast wind.
2.4 Time Series Model and Data Comparison
To perform an initial overall assessment of how well the forecast reproduces the solar wind from the long-lived coronal holes in 2003, we overlay all 12 realizations of the ADAPT-WSA speed forecasts on top of the measured speeds. These realizations are a result of using different random initializations of the supergranulation flow patterns. In Figure 3, the 12 realizations of the 3-day lead time speed forecasts are shown as rainbow-colored lines, and the measured speeds are black points. The ICMEs on the Richardson and Cane list (Richardson & Cane, 2004, 2010, 2012; Richardson et al., 2015) are shown as pink horizontal bars, and the bottom panel repeats the solar wind speed measurements color-coded by the magnetic polarity to enable comparisons with Figure 1. The forecasted peak speeds from the outward polarity polar coronal hole extension are frequently lower than the measured speeds. There are also many instances when the lowest speed intervals have forecasted speeds lower than the measured speeds. To illustrate these points more clearly, we show a zoom of a portion of these results in Figure 4. Based on Figures 3 and 4, it is apparent that WSA does not forecast the speeds well during ICMEs. This is not surprising because WSA forecasts the background wind speed using the expansion factor and the angle to the open-closed field line (coronal hole) boundary, where both of these quantities are derived from global photospheric magnetic flux maps. Therefore, WSA does not model the rapidly evolving, eruptive transient wind.
Throughout this paper we demonstrate our techniques using the 3-day lead time results because the ADAPT-WSA forecasts are based on magnetic flux maps determined from images of the Sun, and the solar wind emitted from the Sun on average takes 3–4 days to reach Earth. The ADAPT-WSA model produces flux maps and speed forecasts for lead times ranging from 1 to 7 days in whole day increments. We found that the 4-day lead time results were quite similar to the 3-day lead time results shown in this paper. This is consistent with prior testing of WSA by Owens et al. (2013). We anticipate that the shorter 1- and 2-day lead times will not produce results that are as good as the 3- and 4-day lead times because those lead times would be based on imaging taken after the solar wind material reaching Earth had left the Sun. Similarly, the longer lead times of 5–7 days would be based on imaging taken a few days prior to when the solar wind left the Sun.
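A back-of-the-envelope check of the 3–4 day transit time is sketched below. It assumes a constant radial speed over a full astronomical unit, which ignores acceleration near the Sun and stream interactions, so it is only a rough guide and not the kinematic propagation used in the model.

```python
AU_KM = 1.495978707e8      # one astronomical unit in km
SECONDS_PER_DAY = 86400.0

def transit_days(speed_km_s, distance_km=AU_KM):
    """Constant-speed travel time, in days, over the given distance."""
    return distance_km / speed_km_s / SECONDS_PER_DAY

for v in (400.0, 500.0, 700.0):
    print(f"{v:.0f} km/s -> {transit_days(v):.2f} days")
```

Under these assumptions, wind at 400–500 km/s takes roughly 3.5 to 4.3 days, while 700 km/s wind arrives in about 2.5 days, so the best-matched lead time depends somewhat on the stream speed.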
2.5 ICME Removal
The coupled ADAPT-WSA models forecast the background wind speed from long-lived solar and coronal structures and not from transients such as ICMEs. Therefore, in our subsequent statistical analysis of the speed errors, we exclude ICMEs. The solar wind properties in regions adjacent to ICMEs, such as in the sheaths of ICMEs, can be altered by the CME propagation through the background wind. Therefore, we removed ICMEs on the Richardson and Cane ICME list (Richardson & Cane, 2004, 2010, 2012; Richardson et al., 2015), along with times within 15 hr prior to the start of the ICME and within 6 hr after the end of the ICME, as demonstrated in Elliott et al. (2012, 2013, 2016).
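The exclusion rule can be expressed as a boolean mask over the time series. The following sketch assumes that the ICME list has already been read into (start, end) pairs; the function name, the inclusive treatment of the window edges, and the single made-up interval are illustrative assumptions.

```python
import numpy as np

def icme_mask(times, icme_intervals, pad_before_hr=15, pad_after_hr=6):
    """Return True for samples to keep (outside the padded ICME intervals)."""
    times = np.asarray(times, dtype="datetime64[m]")
    keep = np.ones(times.shape, dtype=bool)
    before = np.timedelta64(pad_before_hr, "h")
    after = np.timedelta64(pad_after_hr, "h")
    for start, end in icme_intervals:
        lo = np.datetime64(start, "m") - before
        hi = np.datetime64(end, "m") + after
        keep &= ~((times >= lo) & (times <= hi))
    return keep

# Hypothetical hourly time stamps covering five days.
times = np.datetime64("2003-01-01T00", "h") + np.arange(120) * np.timedelta64(1, "h")
# A single made-up ICME interval, not an entry from the actual list.
icmes = [("2003-01-03T00:00", "2003-01-03T12:00")]
keep = icme_mask(times, icmes)
```

Applying `keep` to the speed, f_{p}, and d arrays before any binning leaves only the non-ICME samples.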
2.6 Residual Error Analysis
After removing the ICMEs, we evaluate the accuracy of the ADAPT-WSA 3-day lead time speed forecasts. In Figure 5a we show the occurrence frequency color-coded for given forecasted and measured speed bins. In this format, times when the model and data agree well lie along the diagonal from the bottom left to top right. Many points do lie close to the diagonal, but there are still many instances that do not. The errors are systematic with the speed as revealed in Figure 5b where we show an occurrence plot of the residual (absolute) error versus the measured speed. The histogram of residual errors (Figure 5c) peaks at a value slightly above zero, indicating that overall the model predicts lower speeds than the observed speeds.
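The sign convention matters for reading these plots. Because a histogram peak above zero is interpreted as the model underpredicting, we take the residual to be the measured speed minus the forecasted speed in the sketch below. That convention is inferred from the text, and the speeds and bin width are made up.

```python
import numpy as np

def residual_errors(measured, forecast):
    """Residual = measured - forecast; positive means the model is too slow."""
    return np.asarray(measured, dtype=float) - np.asarray(forecast, dtype=float)

# Hypothetical speeds in km/s.
measured = np.array([420.0, 610.0, 780.0, 390.0])
forecast = np.array([400.0, 560.0, 700.0, 410.0])

res = residual_errors(measured, forecast)
# Histogram of residuals in 25 km/s bins (illustrative bin width).
counts, edges = np.histogram(res, bins=np.arange(-100.0, 101.0, 25.0))
print(res, res.mean())
```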
To investigate these systematic errors, we examined the measured speeds (Figure 6a), model speeds (Figure 6b), and the residual errors (Figure 6c) versus both f_{p} and d. Figure 6b is a visual representation of the WSA formula (Equation 1). By comparing Figures 6a and 6b, it is readily apparent that both the measured speeds and model speeds do show some similar dependences on f_{p} and d. For the purposes of discussion only, we marked 3 zones in f_{p} – d space. In zone 3 at small angles to the boundary, the observations show a dependence on expansion factor not present in the WSA model speeds. In zone 1 at small expansion factor values, both the observed and forecasted speeds are high, and agree fairly well with one another. In zone 2, the measured speeds have a more variable and complex dependence on f_{p} and d than the gradual variations in f_{p} and d found in the forecasted speeds. When we examine the residual errors as a function of f_{p} and d, it becomes clear that the WSA speed formula (Equation 1) could be adjusted both to include an additional expansion factor dependence in zone 3 at small angles to the boundary, and to improve the transition from zone 1 to zone 2, which is sharper in the observations than in the model results.
2.7 Assessment of Corrected Model
Next, we test using the residual error array as a function of f_{p} and d plotted in Figure 6c to correct the WSA forecasted speeds. We use the array plotted in Figure 6c as a correction map to look up correction factors that we then apply to the WSA model speeds at the given modeled f_{p} and d values. Figure 7 is a flow diagram illustrating the steps to determine the corrected forecasts.
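The lookup step can be sketched as follows. This is a minimal illustration, not the code used to produce the figures: it assumes the correction is the mean residual (measured minus model) in each f_{p} – d bin and that it is added to the model speed. The bin edges, function names, and sample values are hypothetical.

```python
import numpy as np

def build_correction_map(fp, d, residual, fp_edges, d_edges):
    """Mean residual (measured - model) in each (f_p, d) bin; NaN where empty."""
    total, _, _ = np.histogram2d(fp, d, bins=[fp_edges, d_edges], weights=residual)
    count, _, _ = np.histogram2d(fp, d, bins=[fp_edges, d_edges])
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)

def apply_correction(model_speed, fp, d, cmap, fp_edges, d_edges):
    """Add the binned mean residual to each model speed. Speeds are left
    unchanged where the bin is empty or the point lies outside the map."""
    model_speed = np.asarray(model_speed, dtype=float)
    i = np.digitize(fp, fp_edges) - 1
    j = np.digitize(d, d_edges) - 1
    inside = (i >= 0) & (i < cmap.shape[0]) & (j >= 0) & (j < cmap.shape[1])
    corr = np.zeros_like(model_speed)
    corr[inside] = np.nan_to_num(cmap[i[inside], j[inside]], nan=0.0)
    return model_speed + corr

# Hypothetical bin edges and samples (speeds in km/s, d in degrees).
fp_edges = np.array([1.0, 10.0, 100.0])
d_edges = np.array([0.0, 5.0, 10.0])
fp = np.array([2.0, 3.0, 50.0, 50.0])
d = np.array([1.0, 2.0, 7.0, 8.0])
model = np.array([600.0, 620.0, 400.0, 380.0])
measured = np.array([650.0, 660.0, 380.0, 370.0])

cmap = build_correction_map(fp, d, measured - model, fp_edges, d_edges)
corrected = apply_correction(model, fp, d, cmap, fp_edges, d_edges)
print(corrected)
```

Leaving speeds unchanged in empty bins is one possible choice. It keeps the uncorrected forecast wherever the map has no coverage.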
Next, we further break down the residual errors for the original WSA speeds and the corrected speeds in Figure 8. The top row of Figure 8 shows the residual errors between the ADAPT-WSA 3-day speed forecasts and the observed speeds (v_{p}) in three formats: as a function of f_{p} and d, f_{p} and v_{p}, and d and v_{p}. Figure 8a is a repeat of Figure 6c to aid comparisons between the uncorrected and corrected results. The residual errors are quite low for the corrected WSA forecasted speeds when examined as a function of f_{p} and d (Figure 8d). This is reasonable given that the correction map is optimized in f_{p} – d space. However, when we examine the residual errors for the corrected forecast as a function of d versus v_{p} (Figure 8e) and f_{p} versus v_{p} (Figure 8f), we find residual errors that are larger than in the f_{p} versus d plot (Figure 8d). These plots illustrate that positive and negative errors occurring over a range of speeds at given f_{p} or given d values are averaged together in f_{p} – d space, producing lower residual errors in f_{p} – d space than in either d – v_{p} or f_{p} – v_{p} space. Therefore, the speeds and residual errors depend on other factors besides f_{p} and d. Such factors might include uncertainties in the photospheric magnetic field, or solar wind acceleration processes that do not depend on f_{p} or d. By comparing Figure 8b with Figure 8e and Figure 8c with Figure 8f, it is clear that overall the errors are lower when we apply the correction to the model speeds. The corrected model has lower errors particularly at low and high speeds where the largest errors occurred for the uncorrected model speeds. Residual errors still remain for the corrected speeds, and these errors vary as a function of the solar wind speed.
In Figure 9 we provide side-by-side comparisons demonstrating how well the ADAPT-WSA model works with and without any corrections applied. Note that for comparison purposes the results in Figures 9a and 9b are repeated from Figures 5a and 5c. Occurrence probabilities versus the model and measured speeds for the uncorrected (Figure 9a) and corrected (Figure 9c) results are on the left. Once the correction factors are applied, more points occur along the diagonal of the model versus measured speed occurrence plot. Also, there is a strong correlation between the measured and forecasted speeds. The histogram distribution of residual errors (right) is more narrowly peaked and centered on zero error for the corrected results (Figure 9d) than for the uncorrected results (Figure 9b). For the results shown in Figure 9 we calculated several quantities to assess how well the model and measured speeds agree with one another for the entire year of 2003 (Table 1). We calculated the Pearson correlation coefficient (R_{p}), the ratio of the Pearson correlation coefficient to , which we will refer to as R_{cl}, the Spearman rank correlation coefficient (R_{s}) (Press et al., 1998), the normalized root mean square error (NRMSE) where the mean measured speed is used to normalize the root mean square, the mean residual error, the standard deviation of the residual error, and the number of data points. If there is no correlation between the two quantities at the 95% confidence level then R_{cl} will be ≤1.6 (Bendant & Piersol, 1971; Borovsky et al., 1998; Elliott et al., 2001). Therefore, to be significant R_{cl} needs to be much larger than 1.6. All of the statistical results in Table 1 indicate that applying the error correction maps to the ADAPT-WSA speeds improves the overall agreement. Both the Pearson and Spearman correlation coefficients are larger for the corrected results.
Other indications that the correction maps improve the results are that the R_{cl} level is higher, the NRMSE is lower, the mean residual error is closer to zero, and the distribution of residual errors is narrower since the standard deviation of the residual errors is smaller.
 Note. The quantities in order are the Pearson correlation coefficient (R_{p}), R_{cl} the ratio of R_{p} to , the Spearman rank correlation coefficient (R_{s}), the normalized root mean square error, the mean residual error, the standard deviation of the residual error, and the number of data points. The first column of numbers shows results using the Air Force Data Assimilative Photospheric Flux Transport (ADAPT) Wang-Sheeley-Arge (WSA) model speeds without any corrections applied, and the second column of numbers shows the results when the residual error maps are used to correct the ADAPT-WSA model speeds. Bold text only marks the row and column labels.
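Most of the summary statistics listed above can be computed with a few lines of NumPy. The sketch below is illustrative rather than the code behind Table 1. It assumes the residual is measured minus forecast, uses the sample standard deviation (ddof=1), and uses simple ordinal ranks for the Spearman coefficient, which do not average ties. R_{cl} is left out, and the input speeds are made up.

```python
import numpy as np

def forecast_statistics(measured, forecast):
    """Summary statistics comparing forecasted and measured speeds."""
    measured = np.asarray(measured, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    residual = measured - forecast

    def ranks(x):
        # Ordinal ranks; ties are not averaged in this simplified version.
        return np.argsort(np.argsort(x)).astype(float)

    return {
        "pearson": np.corrcoef(measured, forecast)[0, 1],
        "spearman": np.corrcoef(ranks(measured), ranks(forecast))[0, 1],
        # RMSE normalized by the mean measured speed, as described in the text.
        "nrmse": np.sqrt(np.mean(residual ** 2)) / measured.mean(),
        "mean_residual": residual.mean(),
        "std_residual": residual.std(ddof=1),
        "n": measured.size,
    }

# Hypothetical speeds in km/s.
stats = forecast_statistics([400.0, 500.0, 600.0, 700.0],
                            [380.0, 470.0, 590.0, 650.0])
print(stats)
```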
We performed additional testing to check that our approach is robust. We determined the residual errors for the last 3 months of the year, from 1 October 2003 through 31 December 2003, using the uncorrected WSA forecasted speeds (Figures 10a and 10b). We then used data from the first 9 months of the year, 1 January 2003 through 30 September 2003, to construct an error correction map, and used this 9-month correction map to correct the forecasted WSA speeds for the last 3 months of the year. The residual error analysis for these corrected speeds for the last 3 months of the year, based on corrections from the first 9 months of the year, is shown in Figures 10c and 10d. Lastly, in Figures 10e and 10f, we corrected the WSA forecasted speeds for the last 3 months of the year using the error correction map based on all of the data for 2003 (Figure 6c) as we did in our earlier analysis. By comparing the rows in Figure 10, it is clear that using the first 9 months to correct the last 3 months does produce a more symmetric residual error histogram centered closer to zero than the uncorrected model, and there are more points occurring along the diagonal of the occurrence versus model and measured speeds. Further improvements are apparent when we use the full year to correct the model results instead of using the first 9 months of the year. For the full year results, the residual error histogram is narrower and more symmetric about zero error, and the occurrence frequencies are higher along the diagonal on the model versus measured speed plot. Table 2 shows the summary statistical results for the same time period shown in Figure 10. These results also support the conclusion that the error correction maps based on the full year improve the results more than the correction maps based on the first 9 months of the year.
The errors are smaller, and the correlation coefficients are larger and more significant.
 Note. This table includes results for the uncorrected Air Force Data Assimilative Photospheric Flux Transport-Wang-Sheeley-Arge model speeds (first column), and the model speeds corrected using maps based on the first 9 months of the year (middle column) and using the maps based on all of 2003 (last column). Bold text only marks the row and column labels.
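The out-of-sample version of this test can be sketched as follows. The example is a toy under stated assumptions: the data are synthetic with a bias that depends only on f_{p}, the correction is the additive binned mean residual, empty bins receive no correction, and all names and bin edges are hypothetical.

```python
import numpy as np

def split_by_date(times, boundary):
    """Boolean masks for samples before and on or after the boundary time."""
    times = np.asarray(times, dtype="datetime64[h]")
    train = times < np.datetime64(boundary, "h")
    return train, ~train

def binned_mean(fp, d, values, fp_edges, d_edges):
    """Mean of values in each (f_p, d) bin; 0 (no correction) where empty."""
    total, _, _ = np.histogram2d(fp, d, bins=[fp_edges, d_edges], weights=values)
    count, _, _ = np.histogram2d(fp, d, bins=[fp_edges, d_edges])
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, 0.0)

# Synthetic hourly data for 2003 with a made-up f_p-dependent model bias.
rng = np.random.default_rng(1)
times = np.datetime64("2003-01-01T00", "h") + np.arange(8760) * np.timedelta64(1, "h")
fp = rng.uniform(1.0, 100.0, times.size)
d = rng.uniform(0.0, 10.0, times.size)
model = rng.uniform(300.0, 700.0, times.size)
measured = model + np.where(fp < 10.0, 60.0, -40.0) + rng.normal(0.0, 20.0, times.size)

fp_edges = np.array([1.0, 10.0, 100.0])
d_edges = np.array([0.0, 5.0, 10.0])

# Build the map from January through September, then correct October through December.
train, test = split_by_date(times, "2003-10-01T00")
cmap = binned_mean(fp[train], d[train], (measured - model)[train], fp_edges, d_edges)
i = np.clip(np.digitize(fp[test], fp_edges) - 1, 0, cmap.shape[0] - 1)
j = np.clip(np.digitize(d[test], d_edges) - 1, 0, cmap.shape[1] - 1)
corrected = model[test] + cmap[i, j]

rmse_raw = np.sqrt(np.mean((measured[test] - model[test]) ** 2))
rmse_corr = np.sqrt(np.mean((measured[test] - corrected) ** 2))
print(f"RMSE uncorrected: {rmse_raw:.1f} km/s, corrected: {rmse_corr:.1f} km/s")
```

Keeping the test months out of the map construction is what distinguishes this check from the full-year correction, where the same samples are used both to build and to evaluate the map.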
In Figure 11 we repeat the analysis from Figure 8 that shows the residual errors in f_{p} – d, d – v_{p} and f_{p} – v_{p} space for the uncorrected and full year corrected forecasts. We add to Figure 11 the same kind of residual error analysis, but now applying the corrections based on the first 9 months. Overall both sets of corrections do significantly reduce the residual errors compared to the uncorrected results. In the f_{p} – d space plots of the residual errors, we find that the residual errors are larger when using the correction factors based on the 9-month corrections (Figure 11d) compared to using the corrections based on the full year (Figure 11g). Also, the errors using the 9-month corrections are still somewhat organized in f_{p} – d space (Figure 11d), and appear to be fainter versions of regions found in the uncorrected results (Figure 11a). The overall patterns of the residual errors in d – v_{p} space (Figures 11e and 11h) and in f_{p} – v_{p} space (Figures 11f and 11i) are quite similar when using corrections based on the first 9 months and on the full year. There are subtle reductions in the errors in f_{p} – v_{p} space when comparing the errors for the full corrections to the 9-month corrections. These reductions in the errors could reflect that the increased statistical coverage produces a better overall correction, or that including the test time period in the construction of the full corrections improved the results. However, the errors are reduced at large angles to the boundary and low speeds (Figure 11h) when we compare the residual errors for the full year corrected results to those obtained when the 9-month corrections are used (Figure 11e). In this case, including the last 3 months of the year for the full corrections added coverage at large d angles and low to moderate speeds, which significantly extended the coverage and improved the correction array.
Additional testing using results from multiple years is an area for further research that will be explored, but generally we obtain quite similar results using both sets of corrections. Therefore, our approach shows robustness and promise.
To illustrate what the forecasts look like when applied to a time series of the speeds, we show the uncorrected model speeds (Figure 12a) from Figure 4 along with the 9-month corrected speeds (Figure 12b), and the full year corrected results (Figure 12c). It is clear that there are significant improvements using the corrections at low and high speeds where the uncorrected forecasts were underpredicting speeds. The corrected speeds at times now overpredict the low speeds, and the corrected speeds are more variable than the uncorrected speeds. However, the corrected results show less overprediction and underprediction for the very high speed outward polarity wind emanating from the polar coronal hole extension. Figure 12 also highlights that the speeds with corrections based on the first 9 months of 2003 (Figure 12b) are quite similar to the speeds based on the full year results (Figure 12c). This result supports the robustness of our approach since the forecasts using corrections based on the first 9 months of the year are quite similar to those using corrections based on the full year. Improving the forecasts for these very fast streams is important since wind speeds between 700 km s^{−1} and 830 km s^{−1} are highly geoeffective and can produce Kp index values as high as 6 and 7 (Elliott et al., 2013). There are additional improvements that could be made, but overall the residual errors for the corrected models are lower than those for the uncorrected model as quantified in Figures 9–11.
3 Discussion
Using the error maps to correct the model speeds significantly reduces the residual speed errors, but this technique does not eliminate all of the errors. Even though the residual speed errors for the corrected speeds are very low in f_{p} – d space (Figure 8d), there are still significant residual errors in f_{p} – v_{p} and d – v_{p} space (Figures 8e and 8f). The low errors in f_{p} – d space reflect positive and negative errors being averaged together and canceling one another out. By comparing the residual speed error maps in f_{p} – v_{p} and d – v_{p} space for the corrected (Figures 8e and 8f) and uncorrected speeds (Figures 8b and 8c), we find that the errors are lower for the corrected speeds than for the uncorrected speeds in both spaces. Given that significant errors are still present, we conclude that the speed errors do not correlate only with f_{p} and d.
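The cancellation described above can be made concrete with a small numerical example. The residuals below are synthetic and the sign convention (observed minus model) is an assumption. They represent samples that share one f_{p} – d bin but fall in two different model speed ranges.

```python
from statistics import mean

# Synthetic residuals (observed minus model, km/s) for samples in a single
# f_p-d bin, split by model speed range. Illustrative values only.
residuals_low_vp = [60.0, 40.0, 50.0]      # model underpredicts at low v_p
residuals_high_vp = [-55.0, -45.0, -50.0]  # model overpredicts at high v_p

# Averaging over the whole bin hides both errors.
bin_mean = mean(residuals_low_vp + residuals_high_vp)

# Splitting by v_p reveals them again.
low_mean = mean(residuals_low_vp)
high_mean = mean(residuals_high_vp)
```

Here the bin-averaged residual is zero even though the error magnitude in each speed range is 50 km s^{−1}, which is why a map that is flat in f_{p} – d space can still show structure in f_{p} – v_{p} or d – v_{p} space.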
We have identified a few potential ways to improve our forecasts. Spikes in our corrected forecasts could potentially be removed by smoothing the correction speed maps. Using more observations to create the correction error maps could also expand the coverage in expansion factor and angle to the coronal hole boundary space. We plan to automate the optimization of the fit parameters in Equation 1 using both the reduced chi-square and p-value techniques discussed by Livadiotis (2007, 2014, 2019, 2020). Our current analysis indicates that there are systematic trends in the residual speed errors, and these trends point to the kinds of changes that need to be made to Equation 1. For instance, the model speeds binned in f_{p} – d space in Figure 6b are a visual representation of the WSA speed formula in Equation 1, and they do not fully agree with the corresponding measured speeds shown in f_{p} – d space in Figure 6a. Additionally, Equation 1 could be modified so that there is a sharper reduction in speed going from region 1 to region 2 in the WSA model (Figure 6). In region 3 at small angles to the boundary, the measured speed shows a dependence on expansion factor that is not present in the WSA model (Figure 6); this dependence could be added to the model by adapting Equation 1 at small angles to the coronal hole boundary.
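One simple way the smoothing of the correction maps mentioned above might be done is a 3 × 3 box average that skips bins without coverage. This is only a sketch of one possible filter, not a method tested in this study. The dictionary layout keyed by integer bin indices and the function name are assumptions.

```python
def smooth_map(correction_map):
    """Replace each populated bin by the mean of itself and its populated
    neighbors in a 3x3 box. Bins without coverage are neither used nor created."""
    smoothed = {}
    for (i, j) in correction_map:
        neighbors = [
            correction_map[(i + di, j + dj)]
            for di in (-1, 0, 1)
            for dj in (-1, 0, 1)
            if (i + di, j + dj) in correction_map
        ]
        smoothed[(i, j)] = sum(neighbors) / len(neighbors)
    return smoothed

# Synthetic 3x3 map (km/s) with a single spike in the center bin.
raw = {(i, j): 10.0 for i in range(3) for j in range(3)}
raw[(1, 1)] = 100.0

smoothed = smooth_map(raw)
```

A box average reduces the spike but also spreads part of it into the neighboring bins. A median filter would suppress isolated spikes without that leakage, at the cost of flattening genuine narrow features.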
Gressl et al. (2014) tested several solar wind speed forecasting models including WSA-ENLIL with a variety of different synoptic maps, and other models (MAS/MAS and MAS/ENLIL). They found that for the year 2007, the forecasted speeds for all the models had correlation coefficients with the solar wind speed observations that ranged from 0.4 to 0.6, but in 2007 the majority of the wind speeds were less than 700 km s^{−1}. Rotter et al. (2012) also tested speed forecasts using empirical relationships between the coronal hole area and the solar wind speed and found correlation coefficients that ranged from 0.69 to 0.77, but for the Rotter et al. study the observations were for 2005 when the solar wind speed measurements were mostly less than 700 km s^{−1}. They included only three points at speeds between 700 and 740 km s^{−1} in their correlation analysis.
It is not surprising that prior studies would not have many data points in the 700–860 km s^{−1} range because for most years the coronal holes are small equatorial holes, and/or there are no large long-lived polar coronal hole extensions reaching low latitudes (Figure 2). When studies use only 1 year to test their forecasts and that year has little or no very fast wind from polar coronal hole extensions, the models may be optimized to forecast the wind from smaller, more typical equatorial coronal holes that emit only moderately fast wind, and not optimized to forecast the very fast wind from polar coronal hole extensions. Given that the slow, moderately fast, and very fast wind are all well represented in 2003, we recommend that others use 2003 to test their forecasts of the background wind or use multiple years to test their forecast abilities.
Owens et al. (2008) found that WSA underpredicted the maximum speed in high speed streams near Earth in the ecliptic, and McGregor, Hughes, Arge, Owens, Odstrĉil (2011) and McGregor, Hughes, Arge, Odstrcil, Schwadron (2011) made adjustments to the WSA speed formula in order to reproduce the high speeds found by Ulysses in polar coronal hole observations. Our work shows that the current WSA speed formula still needs refinement for the polar coronal hole extensions, and that our correction maps significantly improve the results, particularly for the very fast wind from the polar coronal hole extensions. We found that the year 2003 was quite challenging to forecast compared to the data used in the other studies, and the 3-day lead time WSA speed forecast had a correlation coefficient of 0.40. When we applied our corrections, the correlation coefficient increased to 0.63, and the histogram of residual speed errors became very symmetric.
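The two diagnostics quoted above, a correlation coefficient between forecast and observed speeds and the symmetry of the residual histogram, can be computed as in the following sketch. It assumes the Pearson coefficient (this section does not state which coefficient was used) and uses moment-based skewness as a symmetry measure, which is a choice made for illustration. The speeds are synthetic.

```python
import math

def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def skewness(values):
    """Moment-based sample skewness; 0 for a perfectly symmetric sample."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Synthetic observed and forecast speeds (km/s), illustrative only.
observed = [400.0, 500.0, 600.0, 700.0, 650.0]
forecast = [420.0, 480.0, 630.0, 690.0, 600.0]

r = pearson_r(forecast, observed)
residuals = [o - f for o, f in zip(observed, forecast)]  # assumed sign convention
g1 = skewness(residuals)
```

A skewness near zero after correction would be consistent with a symmetric residual histogram, while a value far from zero would indicate a persistent bias toward overprediction or underprediction.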
There are some other potential sources of error not accounted for by WSA or our correction methods. At low speed, the slow wind may not be associated with highly bent open field lines and could instead come from the release of slow wind associated with streamers, which have a closed field line geometry. The WSA speed formula does not simulate such a process. Also, the dynamic interactions that occur en route are likely more complex than the WSA 1D kinematic propagation code simulates.
4 Summary and Conclusions

There are systematic residual speed errors in expansion factor-angle (f_{p} – d) space for the ADAPT-WSA speed forecasts.

Residual speed error maps in f_{p} – d space provide insight into how the WSA speed formula can be adjusted to improve the model speeds.

It is possible to make significant improvements in the speed forecasts by using the residual error maps as correction factors.
We conclude that these residual speed error map corrections can be applied in real time to the speed forecasts because the expansion factor (f_{p}) and angle to the coronal hole boundary (d) are saved along with the forecasted speeds in the output of the WSA model. Our technique, which we tested for the 3-day lead time ADAPT-WSA speed forecasts, can be extended to other forecast lead times since ADAPT-WSA runs include forecast lead times from 1 to 7 days in 1-day increments, and residual error correction maps can be produced for each lead time using the same techniques.
A key implication of this work is that by improving the multiday solar wind speed forecasts, the WSA model can be expanded to include multiday forecasts of other solar wind properties such as density and temperature. For example, there is a strong relationship between the solar wind temperature and speed (Elliott et al., 2005, 2012, 2016), and a strong relationship between the solar wind density and speed (Elliott et al., 2016). These additional solar wind parameters combined with other in situ (e.g., suprathermal particle population) and remote observables (e.g., active regions and solar flare locations) are needed to forecast energetic particle occurrences and properties in interplanetary space (Dayeh et al., 2010, 2018; Papaioannou et al., 2016). Similarly, the solar wind speed and additional solar wind parameters can be used to forecast geomagnetic activity indices (Bala & Reiff, 2012; Elliott et al., 2013; Luo et al., 2017; Wintoft & Wik, 2018). Superposed epoch studies such as the one by Borovsky and Denton (2010) show that the interplanetary magnetic field strength is enhanced in the compressions of Corotating Interaction Regions as the solar wind speed rises. This means that improved speed forecasts may also enable multiday forecasts of the interplanetary magnetic field strength.
Acknowledgments
This work has been partially supported by and Space Weather Operations-to-Research award 80NSSC21K0027 and the ACE mission award 80NSSC18K0223. Maher Dayeh acknowledges partial support from NASA grant NNX13AI75G, Living With a Star award 80NSSC19K0079, and Space Weather Operations-to-Research award 80NSSC20K0290.
Open Research
Data Availability Statement
This work utilizes data produced collaboratively between the Air Force Research Laboratory (AFRL) and the National Solar Observatory (NSO). The Air Force Data Assimilative Photospheric Flux Transport (ADAPT) model development is supported by AFRL. The ADAPT maps are publicly available at NSO: https://www.nso.edu/data/nispdata/adaptmaps/. This work also utilizes the public OMNI solar wind data set available at Goddard Space Flight Center: https://omniweb.gsfc.nasa.gov. The combined ADAPT and Wang-Sheeley-Arge (WSA) models were utilized in this study: https://ccmc.gsfc.nasa.gov/models/WSA~v.2.2/.