Differences between the CME fronts tracked by an expert, an automated algorithm, and the Solar Stormwatch project

Observations from the Heliospheric Imager (HI) instruments aboard the twin STEREO spacecraft have enabled the compilation of several catalogues of coronal mass ejections (CMEs), each characterizing the propagation of CMEs through the inner heliosphere. Three such catalogues are the Rutherford Appleton Laboratory (RAL)‐HI event list, the Solar Stormwatch CME catalogue, and, presented here, the J‐tracker catalogue. Each catalogue uses a different method to characterize the location of CME fronts in the HI images: manual identification by an expert, the statistical reduction of the manual identifications of many citizen scientists, and an automated algorithm. We provide a quantitative comparison of the differences between these catalogues and techniques, using 51 CMEs common to each catalogue. The time‐elongation profiles of these CME fronts are compared, as are the estimates of the CME kinematics derived from application of three widely used single‐spacecraft‐fitting techniques. The J‐tracker and RAL‐HI profiles are most similar, while the Solar Stormwatch profiles display a small systematic offset. Evidence is presented that these differences arise because the RAL‐HI and J‐tracker profiles follow the sunward edge of CME density enhancements, while Solar Stormwatch profiles track closer to the antisunward (leading) edge. We demonstrate that the method used to produce the time‐elongation profile typically introduces more variability into the kinematic estimates than differences between the various single‐spacecraft‐fitting techniques. This has implications for the repeatability and robustness of these types of analyses, arguably especially so in the context of space weather forecasting, where it could make the results strongly dependent on the methods used by the forecaster.


Introduction
Coronal mass ejections (CMEs), eruptions of predominantly coronal plasma and magnetic flux out into the heliosphere [e.g., Webb and Howard, 2012], are widely recognized as a key driver of space weather [Hapgood, 2011; Cannon et al., 2013]. Society's increasing need to effectively mitigate the risks associated with space weather hazards [Hapgood, 2011] motivates continued research into the physics of CMEs, in particular the dynamics of CME eruption and propagation into and through the heliosphere. Such research will improve the accuracy of space weather forecasts, which are now being implemented on an operational basis by a number of agencies and businesses across the globe.
Since late 2006, the Heliospheric Imager (HI) instruments aboard the twin STEREO spacecraft have imaged the sunlight scattered from electrons in the outer corona and inner heliosphere, allowing the motion of plasma density structures to be inferred. These observations have provided a valuable means to investigate solar wind transients responsible for terrestrial space weather events, such as CMEs and corotating interaction regions (CIRs). Any such investigation must first identify and characterize the solar transient as it propagates through the HI field of view (FOV). It is most common to discuss the location of features in the HI FOV in helioprojective radial coordinates: elongation angle (ε) and position angle (PA). The ε of a target is equal to the angle between the observer-Sun center vector and observer-target vector, and the PA is equal to the angle in the image plane between the target-Sun center vector and the direction of solar north, in an anticlockwise sense. Accordingly, for many analyses, characterizing a CME in the HI FOV constitutes identifying the time-elongation (t-ε) profile of the CME front along a particular PA direction. Once the t-ε profile of a CME front is identified and characterized, techniques can be applied to estimate, for example, the CME speed and propagation direction [Rouillard et al., 2008; Lugaz, 2010; Davies et al., 2012; Möstl and Davies, 2012; Harrison et al., 2012; Tucker-Hood et al., 2015; Möstl et al., 2014].
In the context of analyzing HI data, the t-ε profile is typically manually identified; for example, the SATPLOT tool (http://tinyurl.com/satplot) and the online CDPP-Propagation Tool (http://propagationtool.cdpp.eu/) have been developed to facilitate such work. As with any manual identification by a human observer, this is a subjective process with both random and systematic errors associated with each observation. Williams et al. [2009] investigated the magnitude of the uncertainty with which a t-ε profile could be manually identified and concluded that this was between 1° and 2°, depending on both the brightness of the t-ε profile and its location in the HI FOV. Möstl et al. [2011] demonstrated that by averaging repeated identifications of the same t-ε profile, the uncertainty in the elongation coordinates of the profile was approximately 0.5°, although this process would only reduce the magnitude of the random error, not the systematic error, if it is undertaken by the same observer using the same techniques.
There are now several publicly available catalogues of CMEs observed by the HI instruments, each of which provides the t-ε profiles of the CME fronts that are identified by independent methods. The Rutherford Appleton Laboratory (RAL)-HI event list (www.stereo.rl.ac.uk/HIEventList.html) is one such catalogue of solar transients observed in HI, which includes both CME- and CIR-associated transients. For each event the catalogue provides the t-ε profiles of the solar transients in the PA corresponding to the ecliptic plane, which have been manually identified and characterized by a single expert observer. The recently released Solar Stormwatch (SSW) CME catalogue [Barnard et al., 2014] has made available "consensus" t-ε profiles of CME fronts in the HI FOV that are an average of many individual t-ε profiles generated by volunteers participating in the Solar Stormwatch citizen science project (www.solarstormwatch.com). These consensus t-ε profiles of the CME fronts should not suffer from the subjectivity of the manual identification of t-ε profiles by any one individual, as this is minimized by averaging the observations of many individuals. In the near future, the HELCATS project (www.helcats-fp7.eu) will also provide a catalogue of CMEs generated by expert identification using the STEREO-HI data.
At the time of writing, we are unaware of any publicly available automated algorithms for cataloging solar transients, either CMEs or CIRs, in the HI images. This is in contrast to investigations employing coronagraph data, for which many automated algorithms have yielded widely used CME catalogues, such as CACTus [Robbrecht et al., 2009], CORIMP [Byrne et al., 2012; Morgan et al., 2012], ARTEMIS [Floyd et al., 2013], and SEEDS [Olmedo et al., 2008]. However, the HELCATS project is also expected to soon release results from applying a version of the CACTus algorithm to the HI data. Furthermore, we also note that Tappin et al. [2012] developed the AICMED algorithm for detecting CMEs in the Solar Mass Ejection Imager data. We have produced a simple algorithm, J-tracker, that emulates the method of identifying and tracking solar transients employed by the RAL-HI event list. In comparison with both the RAL-HI event list and the Solar Stormwatch procedure, J-tracker is faster to implement. A further advantage of applying an automated algorithm is that the results are repeatable when the procedure is applied to a fixed data set.
As several different catalogues of CMEs seen by HI are now available, each providing the t-ε profiles of some CMEs common to each list, this investigation addresses the question of how these t-ε profiles compare. More specifically, we aim to answer two primary questions: what are the quantitative differences between the t-ε profiles provided by these three catalogues, and do these differences significantly affect estimated event properties, particularly CME propagation direction and speed, which are crucial parameters for space weather forecasting? Furthermore, both the Solar Stormwatch and J-tracker catalogues are generated using new methods. Therefore, comparing the Solar Stormwatch and J-tracker catalogues with the expert-generated RAL-HI catalogue serves as an evaluation of these new methods. Of course, the RAL-HI results cannot be taken as an absolute truth; i.e., any difference between the Solar Stormwatch, J-tracker, and RAL-HI catalogues does not imply that any of the catalogues is incorrect.
The outline of this article is as follows: section 2 introduces the data used throughout the study, describing the STEREO-HI data and the RAL-HI, Solar Stormwatch, and J-tracker catalogues; section 3 discusses the methods employed to make the comparison between the catalogues; section 4 presents the results of this investigation; section 5 concludes this article with a discussion of the results.

Data
This section briefly reviews the data generated by the HI instruments, which are used in the creation of all the data sets employed in this study, as well as the three catalogues of t-ε profiles compared here: the RAL-HI catalogue, the Solar Stormwatch CME catalogue, and the J-tracker catalogue.

BARNARD ET AL.
COMPARING THREE CATALOGUES OF CME FRONTS 2

STEREO-HI
The STEREO spacecraft were launched in late 2006 into Earth-like heliocentric orbits, one ahead (STEREO-A: STA) and one behind (STEREO-B: STB) the Earth. The two spacecraft have been gradually separating from the Earth at a rate that, until the recent conjunction in 2015, increased the spacecraft-Sun-Earth angle by approximately 22.5° per year. Both spacecraft carry the Sun Earth Connection Coronal Heliospheric Investigation suite of imaging instrumentation, which includes the HI instrument [Howard et al., 2008; Eyles et al., 2008]. Each STEREO-HI instrument contains two wide-field white-light cameras (HI1 and HI2) that can image solar wind structures such as CMEs over a total elongation angle range from near 4° to 90° from the Sun. HI1 has a 20° FOV, extending from 4° to 24° in the ecliptic plane with a nominal image cadence of 40 min and a resolution of 70 arc sec. HI2 has a 70° FOV with a nominal image cadence of 120 min and resolution of 4 arc min. In the ecliptic plane the HI2 FOV extends from 18.8° to 74°, spanning slightly less than 70° due to the presence of a trapezoidal occulter that blocks the intense light from Earth [Eyles et al., 2008]. In standard operations the FOV of both HI1 and HI2 is centered in the ecliptic plane. Combining the HI1 and HI2 FOVs allows CMEs to be characterized over a continuous elongation range in the ecliptic spanning 4° to 74°.
Like a coronagraph, HI images solar wind density structures via sunlight that has undergone Thomson scattering from free electrons in the solar wind plasma. However, the majority of the signal received by the cameras results from light scattered from interplanetary dust (the F-corona), and this needs to be removed from the images before the density structures associated with solar wind transients can be seen. Because the F-corona varies slowly in time relative to the HI FOV, it can be characterized over a small sequence of images (on the order of days) and subtracted from each image to produce images that reveal structures within the solar wind. Alternatively, consecutive images can be subtracted from each other to produce difference images. In this way, the contribution of relatively static features, such as the F-corona and the background star field, is minimized, while moving transient enhancements and depletions in the electron density appear as brighter and darker features, respectively [Davies et al., 2009].
As previously mentioned, it is common to characterize a CME by identifying the t-ε profile of its front along a constant PA. Often the t-ε profile is extracted from a t-ε map, also known as a J-map [e.g., Davies et al., 2009]. J-maps are constructed by extracting the brightness profile as a function of elongation, averaged over a limited PA range (typically a few degrees), from a series of images, and stacking these vertically, with time along the x axis. In such J-maps, antisunward propagating transients have positive gradients. An example of a J-map, built from both HI1 and HI2 differenced images, can be seen in Figure 1a. This work makes use of J-maps constructed from both HI1 and HI2 differenced images, created from data within a 5° PA band centered on the ecliptic plane, for both STA and STB. In each J-map, in the overlap region between the HI1 and HI2 FOVs, the latter takes precedence from ε = 18.8°. This is because in this region the increased sensitivity of HI2 relative to HI1 makes the transients appear brighter in the differenced images (when a uniform scaling is applied to the differenced image).
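The J-map construction described above can be sketched as follows. This is a minimal illustration, not the processing chain used for the catalogues: it assumes the differenced images have already been resampled onto a regular (PA, elongation) grid, and the function name `build_jmap` and its arguments are hypothetical.

```python
def build_jmap(images, pa_values, pa_centre, pa_width=5.0):
    """Stack PA-averaged elongation profiles into a J-map.

    images    : list of 2-D lists; images[k][i][j] is the (differenced)
                brightness of image k at PA pa_values[i], elongation bin j.
                A regular (PA, elongation) grid is an assumption of this sketch.
    pa_centre : centre of the PA band in degrees (e.g. the ecliptic PA).
    pa_width  : full width of the PA band in degrees.

    Returns jmap[j][k]: elongation bin j on the vertical axis and
    image (time) index k on the horizontal axis.
    """
    half = pa_width / 2.0
    rows = [i for i, pa in enumerate(pa_values) if abs(pa - pa_centre) <= half]
    if not rows:
        raise ValueError("no image rows fall inside the requested PA band")

    n_elon = len(images[0][0])
    jmap = [[0.0] * len(images) for _ in range(n_elon)]
    for k, image in enumerate(images):
        for j in range(n_elon):
            # Mean brightness over the PA band at this elongation.
            jmap[j][k] = sum(image[i][j] for i in rows) / len(rows)
    return jmap
```

Each image contributes one column, so a feature moving antisunward appears as a track with positive gradient, as noted in the text.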
The CMEs analyzed in this work have been identified in HI data spanning the period from April 2007 until February 2010. Over this period the separation between STA and STB has increased from approximately 3° of longitude (in Heliocentric Earth Ecliptic coordinates) to 136°, while the separation of STA and STB from Earth increased from approximately 1.5° to 65° and 71°, respectively.

The Rutherford Appleton Laboratory HI Event List
A catalogue of solar transients, including both CME- and CIR-associated features, has been produced by a single expert observer at the Rutherford Appleton Laboratory (RAL). This data set is accessible online as the RAL STEREO/HI event list (http://www.stereo.rl.ac.uk/HIEventList.html). The solar transients in this catalogue were both identified and tracked within J-maps created from HI1 and HI2 differenced images, in a 5° PA band centered on the ecliptic plane. The transients were tracked along the light-to-dark boundary that is created by the transients propagating through the differenced images. Lugaz et al. [2012] demonstrated that this boundary tracks a region near the middle of the feature in the background-subtracted images. This method was chosen because this boundary is often the most well-defined feature of a transient's profile [Lugaz et al., 2012]. A necessary criterion for a transient to be included was that it was observable into the HI2 FOV, corresponding to an elongation of 18.8° in the ecliptic plane. Each profile was only tracked once and is not the average of repeated characterizations. This catalogue contains 1660 transients observed by STA and 981 transients observed by STB, over the period April 2007 to December 2011. This catalogue was produced prior to, and independently of, the present study and will hereafter be referred to as the EXP ("expert") catalogue.

Solar Stormwatch CMEs
Solar Stormwatch (http://www.solarstormwatch.com) is a Zooniverse citizen science project, the main objective of which is to identify and characterize CMEs observed by the HI instruments [Barnard et al., 2015]. The project has been running for approximately 5 years, with input from >16,000 citizen scientists, resulting in a data set of >38,000 manually extracted t-ε profiles of CME trajectories. The CMEs are tracked in an 85° PA window in the HI FOV, 47.5° to 132.5° for STA and 227.5° to 312.5° for STB. Each PA window is divided into 17 preselected contiguous PA bands, each 5° wide, as well as a final 5° PA band centered on the ecliptic plane (the PA of which evolves through the year). Using the J-mapping technique, the CMEs are independently tracked through each of the 18 PA bands.
These observations have recently been processed into a CME catalogue [Barnard et al., 2014], consisting of 144 CMEs over the period January 2007 to February 2010, of which 110 were observed by STEREO-A and 77 were observed by STEREO-B. For each CME, the t-ε profiles generated by the citizen scientists are averaged into a consensus profile for each PA along which the event was tracked. Calculation of the consensus profiles is described in Barnard et al. [2014], but, in summary, each consensus profile is calculated from the mean of the t-ε profile coordinates in 3° wide elongation bins. The catalogue is publicly accessible at www.met.reading.ac.uk/~spate/solarstormwatch. This study employs only a subset of the full SSW catalogue, considering only events that were tracked in the ecliptic plane, as only these can be compared with the events in the EXP catalogue. This data set will hereafter be referred to as the SSW ("Solar Stormwatch") catalogue.

J-Tracker
J-tracker was designed to automatically emulate the manual identification and characterization of t-ε profiles associated with solar transients that is performed to create the RAL-HI event catalogue described above. To do this, a combination of image processing techniques is applied to the same J-maps used to establish the RAL-HI catalogue. An example of these J-maps can be seen in Figure 1a. The feature of the transients that guided the expert's identification and tracking was the light-to-dark interface at the sunward (right-hand) side of the brightness (density) enhancement in the J-map, because this typically is the most well-defined feature.
The J-tracker algorithm emulates this process in several stages: rescaling and smoothing, to enhance the visibility of the transient features, and edge detection and feature extraction, to identify and characterize the t-ε profile associated with each transient feature. The brightness intensity in the HI images typically decreases with increasing elongation from the Sun. Therefore, with a uniform scaling applied to each J-map, transient features become less distinct from the background at larger elongations. This is evident from the standard differenced image J-map in Figure 1a, in which the range of intensity variations decreases with increasing elongation through each of the HI1 and HI2 fields of view (separated at 18° elongation). To increase the visibility of transients at larger elongations, rather than using a uniform scaling, a scaling that varies with elongation is employed. Specifically, the J-maps are processed in Carrington solar rotation blocks (from the relevant spacecraft's perspective) and the brightness at each elongation is linearly normalized between the 20th and 80th percentiles of the brightness distribution along that elongation, over that Carrington solar rotation block. Next, a 5 × 5 median filter, which suppresses small-scale structure, is applied to the J-map, making clearer the larger-scale signatures of the t-ε profiles relating to transient features, such as CMEs and CIRs. It also smooths the resolution discontinuity between the HI1 and HI2 data in the J-map. The results of the rescaling and smoothing can be seen in Figure 1b.
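The rescaling and smoothing stage can be illustrated with a short sketch. It is written in plain Python rather than the MATLAB used for J-tracker, and several details are assumptions made here: the percentile interpolation rule, leaving values outside the 20th to 80th percentile range unclipped, and truncating the median window at the map edges. The function names are hypothetical.

```python
import statistics


def percentile(sorted_vals, q):
    """Linearly interpolated percentile (q in 0..100) of a pre-sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def rescale_rows(jmap, q_lo=20, q_hi=80):
    """Normalize each elongation row so that its q_lo-th percentile maps to 0
    and its q_hi-th percentile maps to 1 (values outside are not clipped)."""
    out = []
    for row in jmap:  # one row per elongation, one column per time step
        ordered = sorted(row)
        p_lo, p_hi = percentile(ordered, q_lo), percentile(ordered, q_hi)
        span = (p_hi - p_lo) or 1.0  # guard against a flat row
        out.append([(v - p_lo) / span for v in row])
    return out


def median_filter(jmap, size=5):
    """size x size median filter; windows are truncated at the map edges."""
    half = size // 2
    n_rows, n_cols = len(jmap), len(jmap[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            window = [jmap[rr][cc]
                      for rr in range(max(0, r - half), min(n_rows, r + half + 1))
                      for cc in range(max(0, c - half), min(n_cols, c + half + 1))]
            out[r][c] = statistics.median(window)
    return out
```

In this sketch `rescale_rows` would be applied to the J-map of one Carrington rotation block at a time, followed by `median_filter` on the result.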
Following this, a Canny-type edge detector is applied to the rescaled and smoothed J-map to identify regions of large brightness gradient. The formulation of the Canny edge detector is described in Canny [1986], and we use the implementation provided in the MATLAB Image Processing Toolbox. The edge detector is sensitive to all features that cause large brightness gradients in the J-maps, including some features that are not of interest. In particular, planets and bright stars in the HI FOV cause bright, temporally extended traces in J-maps, where the brightness gradient tends to vary more in elongation (vertical axis) than in time (horizontal axis). These trails, primarily caused by planets, are removed by rejecting edges where more than 75% of the brightness gradient magnitude is directed along the elongation axis. This criterion was arrived at empirically but appears to work satisfactorily, removing planet trails without overtly impacting the detection of solar wind transients.
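One possible reading of the planet-trail criterion is sketched below. The text does not give the exact formula, so the definition used here (the absolute elongation component of the gradient divided by the gradient magnitude) is an assumption, as are the function names.

```python
import math


def elongation_gradient_fraction(grad_time, grad_elon):
    """Fraction of the brightness-gradient magnitude that lies along the
    elongation (vertical) axis of the J-map. Returns 0.0 for a zero gradient."""
    magnitude = math.hypot(grad_time, grad_elon)
    return abs(grad_elon) / magnitude if magnitude > 0 else 0.0


def keep_edge_pixel(grad_time, grad_elon, max_fraction=0.75):
    """True if the edge pixel is kept, False if it is rejected as a likely
    planet or star trail (gradient dominated by the elongation direction)."""
    return elongation_gradient_fraction(grad_time, grad_elon) <= max_fraction
```

A horizontal trail in the J-map has a gradient that is almost entirely vertical, so its fraction is close to 1 and it is rejected, whereas a diagonal transient track has comparable components in both directions and is kept.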
Both the dark-to-light (antisunward) and light-to-dark (sunward) edges can be identified, such that it is possible that both the front and back of the same brightness enhancement can be followed. Here the use of "sunward" and "antisunward" edges refers specifically to their relative locations in the differenced images and J-maps. Lugaz et al. [2012] discuss the relative locations of features in differenced images and background-subtracted images. Presently, we only want to obtain one t-ε profile per event. As the light-to-dark edge is typically better defined, as can be seen in Figures 1a and 1b, we choose to extract only the light-to-dark edges and reject the dark-to-light edges. Figure 1c demonstrates the results of applying the edge detection and selection criteria to the smoothed and rescaled J-map presented in Figure 1b; the image edges corresponding to potential features of interest are shown as white lines. The effects of the criteria used to reject planet trails can be seen near 40° elongation, in the diagonal blank region, which, in this instance, is caused by a bright star.
Finally, the J-tracker catalogue is formed by extracting the t-ε profiles of all remaining edges that begin between 5° and 9° elongation and extend continuously for more than 5°, with a positive gradient. Edges with breaks of more than one pixel (in either direction) or that have a negative gradient (implying sunward propagation) are rejected. These are arbitrary criteria that were chosen to balance robustly identifying clear features associated with solar transients and minimizing the identification of false-positive events and the computation time. In Figure 1d, the entries in the J-tracker catalogue are overlaid on the J-map from Figure 1a, with green dots marking the t-ε profiles. In this example the J-tracker algorithm does well at identifying the main features in the J-map. We note that, much like the EXP catalogue, J-tracker does not exclusively identify CMEs and includes transients of other origin, such as those associated with CIRs. This catalogue contains 917 transients observed by STA and 521 transients observed by STB, spanning the period April 2007 to December 2011, i.e., the same period as the EXP catalogue. This data set will hereafter be referred to as the JTR ("J-Tracker") catalogue.
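The selection criteria above can be expressed as a simple filter on a candidate edge. The sketch below is illustrative only: the pixel-to-elongation mapping (`elon_min`, `deg_per_pixel`), the interpretation of a "break" as the number of skipped pixels between consecutive edge pixels, and the function name are all assumptions.

```python
def accept_edge(pixels, elon_min=4.0, deg_per_pixel=0.5,
                start_range=(5.0, 9.0), min_extent=5.0, max_break=1):
    """Decide whether a detected edge qualifies as a catalogue entry.

    pixels : list of (time_index, elongation_index) pairs, ordered in time.
    The linear mapping elongation = elon_min + deg_per_pixel * index is a
    hypothetical J-map grid chosen for this sketch.
    """
    if len(pixels) < 2:
        return False
    elon = [elon_min + deg_per_pixel * row for _, row in pixels]

    # The edge must begin between 5 and 9 degrees elongation ...
    if not start_range[0] <= elon[0] <= start_range[1]:
        return False
    # ... and extend for more than 5 degrees.
    if elon[-1] - elon[0] <= min_extent:
        return False

    for (c0, r0), (c1, r1) in zip(pixels, pixels[1:]):
        # A step back in elongation implies sunward motion: reject.
        if c1 < c0 or r1 < r0:
            return False
        # Reject breaks of more than max_break skipped pixels in either axis.
        if c1 - c0 - 1 > max_break or r1 - r0 - 1 > max_break:
            return False
    return True
```

For example, a track running one pixel per time step from 6° to 14° passes, while the same track with three consecutive pixels missing is rejected.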

Matched Event List
The matched event list was created by visually inspecting the ecliptic J-maps overlaid with the ecliptic t-ε profiles corresponding to the 106 SSW events, as well as every EXP and JTR event profile that also began within a ±10 h window of the SSW event onset. Specifically, we aimed to visually identify the EXP and JTR profiles that most closely tracked the same feature identified by the SSW profile. The ±10 h window width was chosen as it is large enough that we consider it highly unlikely that profiles separated by more than this will be genuinely associated. An example of a set of matched events is shown in Figure 2a. Each of the plots used in deriving the matched event list is available for viewing online at goo.gl/igg1LM; each plot uses the same color coding as Figure 2a. The resulting matched event list only includes events for which it was possible to match both a JTR and an EXP event with an SSW event. In total there are 51 (34 by STA and 17 by STB) such matched events. This list is available in the supporting information of this paper.
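Only the preselection step lends itself to code, since the final matching was done by visual inspection. A minimal sketch of gathering the EXP or JTR profiles that begin within ±10 h of an SSW onset is given below; the record layout (a dict with an `"onset"` datetime) and the function name are hypothetical.

```python
from datetime import datetime, timedelta


def candidates_within_window(ssw_onset, profiles, window_hours=10.0):
    """Return the profiles whose onset lies within +/- window_hours of the
    SSW event onset. Each profile is a dict with an 'onset' datetime."""
    window = timedelta(hours=window_hours)
    return [p for p in profiles if abs(p["onset"] - ssw_onset) <= window]


# Example: two of three hypothetical profiles fall inside the window.
example = candidates_within_window(
    datetime(2008, 1, 1, 0),
    [{"name": "a", "onset": datetime(2008, 1, 1, 5)},
     {"name": "b", "onset": datetime(2008, 1, 1, 23)},
     {"name": "c", "onset": datetime(2007, 12, 31, 14)}],
)
```

The candidates returned for each SSW event would then be overlaid on the J-map and assessed by eye, as described in the text.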

Calculating Differences Between the EXP, JTR, and SSW Profiles
10.1002/2015SW001280
To directly compare the EXP, JTR, and SSW t-ε profiles, we analyze the elongation differences at fixed times, Δε, and timing differences at fixed elongations, Δt, of the JTR and SSW profiles relative to the EXP profile. Specifically, for each point in the JTR and SSW t-ε profiles, we calculate, by linear interpolation, the corresponding time coordinate at fixed elongation, and elongation coordinate at fixed time, of the EXP profile. With the interpolated EXP coordinates we then calculate the timing and elongation differences of each point along the JTR profiles, Δt_J and Δε_J, as well as of the SSW profiles, Δt_S and Δε_S, according to
$$\Delta t_J = t_e - t_j, \qquad \Delta\varepsilon_J = \varepsilon_e - \varepsilon_j,$$
$$\Delta t_S = t_e - t_s, \qquad \Delta\varepsilon_S = \varepsilon_e - \varepsilon_s,$$
where the subscripts e, j, and s refer to the EXP, JTR, and SSW profiles, respectively. This is demonstrated graphically by the schematic in Figure 2b. These calculated differences form the basis of our comparison between the three sets of profiles. We limit calculating the differences to the coordinate range spanned by each EXP profile; i.e., the EXP profile is not extrapolated beyond its actual coordinate boundaries. In the 51 matched events the JTR and SSW profiles include a total of 2105 and 815 t-ε coordinates, respectively. This allows us to calculate 2057 differences for the JTR profiles and 633 differences for the SSW profiles.
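The difference calculation can be sketched as follows. The sign convention (EXP minus the other profile) follows the EXP-SSW and EXP-JTR labels used with the histograms; the assumption that the EXP profile increases monotonically in both time and elongation, and the function names, are specific to this sketch.

```python
def interp(x, xs, ys):
    """Linearly interpolate ys(xs) at x. Returns None outside the range of
    xs, so the reference profile is never extrapolated. xs must be sorted."""
    if x < xs[0] or x > xs[-1]:
        return None
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        if x0 <= x <= x1:
            if x1 == x0:
                return ys[i]
            return ys[i] + (ys[i + 1] - ys[i]) * (x - x0) / (x1 - x0)
    return None


def profile_differences(exp_t, exp_eps, other_t, other_eps):
    """Return (delta_t, delta_eps): EXP minus the other profile, evaluated
    at each point of the other (JTR or SSW) profile."""
    delta_t, delta_eps = [], []
    for t, eps in zip(other_t, other_eps):
        eps_exp = interp(t, exp_t, exp_eps)   # EXP elongation at the same time
        if eps_exp is not None:
            delta_eps.append(eps_exp - eps)
        t_exp = interp(eps, exp_eps, exp_t)   # EXP time at the same elongation
        if t_exp is not None:
            delta_t.append(t_exp - t)
    return delta_t, delta_eps
```

With this convention a profile that leads the EXP profile (larger elongation at a given time) gives negative Δε and positive Δt, which is why the two sets of differences are negatively correlated.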

Single-Spacecraft-Fitting Methods
Methods have been developed that allow estimation of the radial speed and propagation direction of a CME from its t-ε profile, observed from a single satellite. Three widely used methods are fixed-phi fitting (FPF) [Sheeley et al., 1999, 2008; Rouillard et al., 2008], harmonic-mean fitting (HMF) [Lugaz, 2010], and self-similar-expansion fitting (SSEF) [Davies et al., 2012; Möstl and Davies, 2012]. These methods each assume a fixed geometry for the CME structure and that the CME propagates at a constant speed in a fixed radial direction. For each model a theoretical expression for the elongation angle as a function of time has been derived such that a numerical fit between the observed t-ε profile and the theoretical elongation angle variation can yield estimates of the CME speed (V_cme) and direction (φ). The FPF technique is the simplest of these three, modeling the CME as a point source moving radially outward from the Sun at constant speed.
The HMF technique models the CME as a radially expanding circle anchored at the Sun center. Finally, the SSEF technique models the CME as a radially expanding circle that is not anchored to the Sun but that subtends a fixed angle with respect to the Sun center. Therefore, the SSEF technique can model a continuum of CME geometries, for which the FPF and HMF techniques are two limiting cases. Möstl et al. [2014] reviewed the performance of these single-spacecraft-fitting methods and demonstrated that the FPF method provides the least biased estimate of the CME trajectory, with both the HMF and SSEF (using a CME half width of 45°) methods tending to give biased estimates of φ. Uncertainties on the estimated quantities can be calculated according to the method described by Rouillard et al. [2010] and Williams et al. [2009], although we note that these error estimates only relate to the quality of the numerical fit between the observed and theoretical profiles; for each of these models there is an additional unquantified error that depends on how valid the assumptions of each model are for a given event [Savani et al., 2012; Möstl et al., 2014]. Here the best fit V_cme and φ estimates are calculated using MATLAB's nonlinear least squares curve fitting function, lsqcurvefit, and the fit errors are calculated using the Williams et al. [2009] method.
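To make the three geometries concrete, the sketch below evaluates the elongation of the front of a circular CME cross section seen from a single observer, using the tangent-to-a-circle geometry that underlies the self-similar-expansion model. It is a forward model only, not the fitting code used in this study. The assumptions made here are: the observer sits at a fixed distance of 1 AU, the CME leaves Sun center at t = 0 with constant speed, φ is measured from the Sun-observer line, and the function and parameter names are hypothetical. A half width of 0° should reproduce the fixed-phi geometry and 90° the harmonic-mean geometry.

```python
import math

AU_KM = 1.495978707e8  # astronomical unit in km


def sse_elongation(t_hours, speed_kms, phi_deg, half_width_deg, d_km=AU_KM):
    """Elongation (degrees) of the leading tangent to an expanding circular
    front, t_hours after it left Sun center at constant speed speed_kms.

    phi_deg        : angle between the Sun-observer line and the CME direction.
    half_width_deg : angular half width subtended by the circle at Sun center
                     (0 gives the fixed-phi case, 90 the harmonic-mean case).
    """
    r = speed_kms * t_hours * 3600.0          # apex distance from Sun center
    if r <= 0:
        raise ValueError("t_hours and speed_kms must give a positive distance")
    phi = math.radians(phi_deg)
    c = math.sin(math.radians(half_width_deg))
    a = d_km * (1.0 + c) / r - math.cos(phi)
    b = math.sin(phi)
    cos_eps = (-b * c + a * math.sqrt(a * a + b * b - c * c)) / (a * a + b * b)
    return math.degrees(math.acos(cos_eps))


def fpf_elongation(t_hours, speed_kms, phi_deg, d_km=AU_KM):
    """Fixed-phi elongation (degrees) of a point moving radially outward."""
    r = speed_kms * t_hours * 3600.0
    phi = math.radians(phi_deg)
    return math.degrees(math.atan2(r * math.sin(phi), d_km - r * math.cos(phi)))
```

A fit would then adjust `speed_kms` and `phi_deg` (and the launch time) to minimize the misfit between such a model curve and the observed t-ε profile, for example with a nonlinear least squares routine.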

Comparison of All Differences
Figures 3a and 3b show histograms of the distributions of Δε_S and Δε_J, respectively. The histograms are calculated using elongation bins 0.5° wide. The center of the Δε_S distribution (EXP-SSW) is clearly more negative than the zero-difference line, with a mean of −1.76° ± 0.09°. Conversely, the central tendency of the Δε_J distribution (EXP-JTR) is much closer to zero and positive, with a mean of 0.28° ± 0.03°. The Δε_S distribution is broader than the Δε_J distribution, with standard deviations of 2.24° and 1.52°, respectively. These summary statistics, as well as summary statistics of subsequent results, are included in Table 1.
Neither distribution appears to have any significant skewness. These observations indicate that the JTR profiles are typically closer in elongation to the EXP profiles than the SSW profiles are, with the SSW profiles having a larger systematic bias to greater elongations (negative Δε_S), as well as more variability.
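The text does not state how the quoted ± values were obtained. As a consistency check only, and assuming they are standard errors of the mean, σ/√N, with N taken as the number of differences reported for each catalogue (633 for SSW and 2057 for JTR), the quoted standard deviations give:

```latex
\frac{\sigma_S}{\sqrt{N_S}} = \frac{2.24^\circ}{\sqrt{633}} \approx 0.09^\circ,
\qquad
\frac{\sigma_J}{\sqrt{N_J}} = \frac{1.52^\circ}{\sqrt{2057}} \approx 0.03^\circ
```

These agree with the quoted uncertainties, although this reading treats the points along each profile as independent samples, which they are unlikely to be.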
Figure 3. (a and b) Histograms of the distributions of Δε_S and Δε_J. The histograms are calculated using elongation bins 0.5° wide. The black dashed lines highlight the zero-difference point, while the orange lines mark the cumulative distribution functions corresponding to each histogram. (c and d) The elongation differences as a function of time after the event onset. Each of the Δε_S and Δε_J points is shown as a grey dot. These points were divided into 5 h wide bins and, for each bin containing more than 20 points, from more than 10 events, the median, 25th and 75th percentiles, and the 10th and 90th percentiles of Δε were calculated. These are shown as the purple, blue, and red lines, respectively.
We can analyze this in more detail by considering how the elongation differences vary with time after the event onset (Figures 3c and 3d). The Δε values were binned into 5 h wide bins, according to time after onset, and the median, the 25th and 75th percentiles, and the 10th and 90th percentiles calculated. To help ensure this analysis is not biased by poor sampling, these percentiles were not calculated for any bin which contained fewer than 20 samples, or if the samples came from fewer than 10 events.
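The binning and sampling safeguards can be sketched as below. The thresholds follow the wording of the body text (bins with fewer than 20 samples, or samples from fewer than 10 events, are skipped); the percentile interpolation rule, the input layout, and the function names are assumptions of this sketch.

```python
from collections import defaultdict


def percentile(sorted_vals, q):
    """Linearly interpolated percentile (q in 0..100) of a pre-sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def binned_percentiles(samples, bin_width=5.0, min_samples=20, min_events=10,
                       qs=(10, 25, 50, 75, 90)):
    """samples: iterable of (x, difference, event_id), where x is, e.g., the
    time after onset in hours. Returns {bin_start: {q: percentile}} for the
    bins that satisfy both sampling criteria."""
    bins = defaultdict(list)
    for x, value, event_id in samples:
        bins[int(x // bin_width)].append((value, event_id))

    result = {}
    for index in sorted(bins):
        entries = bins[index]
        n_events = len({event_id for _, event_id in entries})
        if len(entries) < min_samples or n_events < min_events:
            continue  # poorly sampled bin: no percentiles reported
        values = sorted(value for value, _ in entries)
        result[index * bin_width] = {q: percentile(values, q) for q in qs}
    return result
```

The same function applies to the timing differences by passing elongation as `x` with `bin_width=3.0`.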
We first consider the behavior of Δε_S in Figure 3c. As the time from event onset increases, there appears to be a weak trend in the median to more negative values of Δε_S and the plotted percentiles diverge from each other. This indicates that the center of the Δε_S distribution moves farther from the zero line as a function of time after the event onset and that the variability in Δε_S increases. This means that there are larger differences between the SSW and EXP profiles at later times in the t-ε profile.
Figure 4. (a and b) Histograms of the distributions of Δt_S and Δt_J. The histograms are calculated using time bins 1 h wide. The black dashed lines highlight the zero-difference point, while the orange lines mark the cumulative distribution functions corresponding to each histogram. (c and d) The timing differences as a function of elongation. Each of the Δt_S and Δt_J points is shown as a grey dot. These points were divided into 3° wide bins, and, for each bin containing more than 20 points, from more than 10 events, the median, 25th and 75th percentiles, and the 10th and 90th percentiles of Δt were calculated. These are shown as the purple, blue, and red lines, respectively.
In contrast, Figure 3d shows that the median value of Δε_J stays close to the zero line over the entire range of times from event onset. The 10th, 25th, and 75th percentiles do not diverge much from one another, although the 90th percentile diverges quickly, appearing heavily affected by a few outlying events. Generally, the differences in elongation between the JTR and EXP profiles are fairly consistent throughout these events.
This analysis was repeated for the time differences, Δt_S and Δt_J. However, we note that the time and elongation differences are highly (negatively) correlated and consequently analyzing the timing differences does not add independent information to this investigation. It is for completeness that we have performed this investigation in terms of both variables, as there is no clear reason to select one in preference to the other. Figure 4 is composed identically to Figure 3, except that it instead considers the timing differences. The histograms in Figures 4a and 4b were created using a bin width in Δt_S and Δt_J of 1 h. Figures 4c and 4d present the variation of the time differences as a function of elongation. In this instance the percentiles were calculated using 3° wide elongation bins, and only for bins with more than 20 samples, from no fewer than 10 events. The distributions of the timing differences are, as expected, consistent with the distributions of the elongation differences, also demonstrating that the SSW profiles have a larger systematic difference from the EXP profiles than do the JTR profiles.

Comparison of Event-Averaged Differences
The results discussed in section 4.1 will not apply to each event equally, and so it is informative to also consider the event-averaged time and elongation differences. Here, for each event, we calculate the mean and root-mean-square (RMS) values of the time and elongation differences, which we refer to as ⟨Δt⟩, ⟨Δε⟩, [Δt]_rms, and [Δε]_rms. Figure 5 shows a comparison of the mean and RMS values of Δε and Δt, calculated for each event.
Figure 5a compares the event means of Δε_S and Δε_J. Figure 5c compares the event means of Δt_S and Δt_J. Figure 5b compares the RMS values of Δε_S and Δε_J. Figure 5d compares the RMS values of Δt_S and Δt_J.
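For a single event, the two summary values can be computed as in the sketch below. The RMS is taken about zero (the root of the mean squared difference), which is the reading assumed here; the function name is hypothetical.

```python
import math


def event_mean_and_rms(differences):
    """Return (mean, rms) of one event's time or elongation differences.
    The RMS is computed about zero, not about the mean."""
    n = len(differences)
    if n == 0:
        raise ValueError("an event needs at least one difference")
    mean = sum(differences) / n
    rms = math.sqrt(sum(d * d for d in differences) / n)
    return mean, rms
```

The mean retains the sign of any systematic offset, while the RMS measures the typical size of the differences regardless of sign, so an event whose differences alternate in sign can have a mean near zero but a large RMS.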
Figures 5a and 5c show that similar relationships exist between the SSW, JTR, and EXP event-averaged differences to those found in section 4.1 for individual points along the t-ε profiles. The values of ⟨Δε_J⟩ and ⟨Δt_J⟩ cluster around the zero-difference line, with means of 0.08° ± 0.20° and −0.02 ± 0.36 h, respectively. The ⟨Δε_S⟩ and ⟨Δt_S⟩ are systematically offset from the zero-difference line, with means of −1.72° ± 0.25° and 3.12 ± 0.43 h, respectively. Table 1 shows that these values are similar in sign and magnitude to the mean values of all the differences investigated in section 4.1. Now considering the RMS differences in Figures 5b and 5d, in each plot the population of points is strongly biased to locations above the one-to-one line. This also indicates that the JTR profiles are typically closer to the EXP profiles than the SSW profiles are.
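As a minimal illustration of why both statistics are reported, the sketch below computes the event mean and RMS of a set of differences. The function name and the plain-list input are assumptions for illustration. The mean retains the sign of a systematic offset, whereas the RMS measures the typical size of the differences regardless of sign.

```python
import math

def event_mean_and_rms(diffs):
    """Return (mean, RMS) of the differences along one event's profile."""
    n = len(diffs)
    mean = sum(diffs) / n
    rms = math.sqrt(sum(d * d for d in diffs) / n)
    return mean, rms
```

A profile that alternately leads and lags the reference by the same amount has a mean difference of zero but a nonzero RMS, which is why the two panels of Figure 5 carry complementary information.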

Comparison of J-Map Brightness and Brightness Gradient Distributions
By design, the JTR profiles track the light-to-dark boundary at the sunward side of the brightness enhancement created by the motion of the CME front through the FOV of the HI differenced images. Similarly, it is known that this boundary was also used in identifying and extracting the EXP profiles. The statistical results presented in Figures 3-5 are consistent with this. The EXP and JTR profiles are generally close to each other, with a bias indicating that on average, the EXP profiles lead the JTR profiles by a small amount. The SSW differences revealed a much larger bias, which indicates that on average, SSW profiles lead both the EXP and JTR profiles. We interpret these results as follows: the JTR profiles successfully track the light-to-dark boundary on the sunward side of a transient's profile; the EXP profiles track close to the light-to-dark boundary, with a small bias toward being located more inside the brightness enhancement; and the SSW profiles track inside the brightness enhancement, located more toward the antisunward edge of the transient.
This interpretation was validated by comparing the distributions of J-map brightness and J-map brightness gradient magnitude at the coordinates of the t-ε profiles for each of the SSW, EXP, and JTR profiles. If our interpretation is correct, the SSW profiles should typically be located in regions of higher-intensity brightness and smaller brightness gradient magnitude (being inside the brightness enhancement and farther from the light-to-dark boundary), and the JTR profiles should typically be located in regions of less intense brightness and larger brightness gradient magnitude. In the following, we calculate the J-map brightness gradient magnitude as the absolute value of the horizontal (time) gradient in the J-map, |∂B/∂t|, evaluated at the coordinates (t_i, ε), where B is the J-map brightness and the subscript i indexes the set of observation times of the J-map.
As can be seen in Figure 1a, the brightness of features seen in HI1 and HI2 differenced J-maps typically decreases with increasing elongation, due to the reasons discussed in section 2.4. Consequently, the distributions of J-map brightness and brightness gradient magnitude vary with elongation. This must be accounted for if we are to compare the distribution of J-map brightness and brightness gradient magnitude at the t-ε coordinates of the EXP, JTR, and SSW profiles spanning all elongations. Here this is done by normalizing the J-map brightness and brightness gradient magnitude at each elongation. Both quantities are normalized using the maximum and minimum values observed at that elongation, over a 27 day period centered on the event onset. The brightness and brightness gradient magnitude values at the coordinates of the t-ε profiles are then calculated using 2-D nearest-neighbor interpolation.
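The three steps described above (time gradient magnitude, per-elongation min-max normalization, and nearest-neighbor lookup) can be sketched as below. This is a hedged illustration, not the processing used in the study: the J-map is assumed to be a list of elongation rows each sampled at the same observation times and already restricted to the 27 day window, the central-difference scheme is an assumption, and all function names are hypothetical.

```python
def time_gradient_magnitude(jmap, times):
    """|dB/dt| along each elongation row.

    Uses central differences in the interior and one-sided differences at
    the two ends; this differencing scheme is an assumption.
    """
    n = len(times)
    grad = []
    for row in jmap:
        g = []
        for i in range(n):
            a, b = max(i - 1, 0), min(i + 1, n - 1)
            g.append(abs((row[b] - row[a]) / (times[b] - times[a])))
        grad.append(g)
    return grad

def normalise_rows(field):
    """Min-max rescale each elongation row to the range [0, 1]."""
    out = []
    for row in field:
        lo, hi = min(row), max(row)
        span = hi - lo
        # A constant row has no dynamic range; map it to zero.
        out.append([(v - lo) / span if span else 0.0 for v in row])
    return out

def nearest(field, times, elongs, t, e):
    """2-D nearest-neighbor lookup of field at the coordinate (t, e)."""
    i = min(range(len(times)), key=lambda k: abs(times[k] - t))
    j = min(range(len(elongs)), key=lambda k: abs(elongs[k] - e))
    return field[j][i]
```

Normalizing row by row removes the overall decline of brightness with elongation, so that values sampled at different elongations can be pooled into a single histogram.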
Figure 6 shows histograms of the distributions of (a) the J-map brightness and (b) brightness gradient magnitude for the EXP, SSW, and JTR profiles. Each histogram was calculated using bin widths of 0.05 in the rescaled units. The brightness distributions have similar modal values but, as surmised earlier, the location of the SSW distribution appears to be slightly offset to higher-intensity brightness than the JTR and EXP distributions. A one-tailed, two-sample Kolmogorov-Smirnov test was applied to examine the null hypothesis that the underlying SSW brightness distribution was the same as each of the underlying EXP and JTR brightness distributions, with the differences due only to random sampling. For both of the SSW-JTR and SSW-EXP pairings, the null hypothesis was rejected at greater than the 99% confidence level, showing that there are significant differences between the distributions that are larger than would be expected due to random sampling alone.
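For readers unfamiliar with the one-tailed form of the test, a minimal sketch is given below. It computes the one-sided statistic D+ and the standard large-sample approximation to its p-value; it is not the implementation used here, and the function name and the direction convention (testing whether the first sample sits at higher values than the second) are assumptions. Library routines such as SciPy's `ks_2samp`, which accepts an `alternative` argument, would normally be preferred for real data.

```python
import math

def ks_one_sided(x, y):
    """One-sided two-sample Kolmogorov-Smirnov test.

    D+ is the largest amount by which the empirical CDF of y exceeds that
    of x, so it is large when x is shifted to higher values than y.
    The p-value uses the asymptotic form exp(-2 * m*n/(m+n) * D+^2),
    which is only approximate for small samples.
    """
    xs, ys = sorted(x), sorted(y)
    m, n = len(xs), len(ys)
    d_plus = 0.0
    i = j = 0
    for v in sorted(set(xs + ys)):
        while i < m and xs[i] <= v:
            i += 1
        while j < n and ys[j] <= v:
            j += 1
        d_plus = max(d_plus, j / n - i / m)
    p_value = math.exp(-2.0 * m * n / (m + n) * d_plus ** 2)
    return d_plus, p_value
```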
The brightness gradient magnitude distributions (Figure 6b) clearly show that the SSW profiles are more frequently located in regions of lower brightness gradient than the EXP and JTR profiles. Both the EXP and JTR distributions display large increases of relative frequency in the highest brightness gradient bin. This shows that the JTR and EXP profiles are frequently located in regions where the brightness gradient is close to the maximum value in the 27 day window used to calculate the rescaled values. No such large increase is observed in the SSW distribution. These observations are also consistent with the predictions made above, supporting our interpretation of the time and elongation differences between the SSW, EXP, and JTR profiles.

Comparison of Tracked Elongation Range
Previous research has demonstrated that the accuracy of estimates of the speed and propagation direction of a solar transient, derived from the single-spacecraft-fitting methods discussed in section 3.3, increases with the elongation range over which the transient is tracked [Williams et al., 2009]. Therefore, in this section, we compare the maximum elongation extent of the EXP, JTR, and SSW profiles. Figure 7a displays histograms of the distributions of the maximum elongation of the three sets of profiles. These histograms were calculated using elongation bins 5° wide. The JTR distribution has little overlap with the EXP and SSW distributions, being located farther to the left, at lower elongations. This shows that the JTR profiles do not track transients over an elongation range as large as the EXP and SSW profiles. The EXP and SSW distributions are more similar, but the SSW profiles have a tendency to track CMEs out to larger elongations than the EXP profiles.
Figure 7b displays these data differently, as a scatterplot of the maximum elongation of the EXP profiles versus the maximum elongation of the JTR (in green) and SSW (in blue) profiles, with the red line marking the one-to-one line. The distribution of these points around the one-to-one line reveals that the JTR profiles almost exclusively track the CMEs over a shorter elongation range than the EXP profiles, while on average the SSW profiles extend over a greater range of elongation angles. There is positive linear correlation between the maximum elongation of the EXP and SSW profiles (Pearson's r = 0.448, p = 9.8 × 10⁻⁴), but no such linear correlation exists between the EXP and JTR profiles (Pearson's r = 0.035, p = 0.807). We interpret this as evidence that the method used to identify the JTR profiles reaches a limit near 20° elongation, past which it typically fails to track transients. This limit is related to the merging of the HI1 and HI2 FOVs, which occurs at 18.8° elongation. This potentially impacts the accuracy of CME speed and direction estimates made with the single-spacecraft techniques discussed in section 3.3, as Williams et al. [2009] argued that transients should be tracked out past 30° to obtain reliable estimates of the CME speed and direction with the FPF technique.
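The correlation coefficients quoted above can be related to their significance through the usual t statistic, t = r √[(n − 2)/(1 − r²)], which follows a Student's t distribution with n − 2 degrees of freedom under the null hypothesis of no correlation. The sketch below is illustrative only; the function names are hypothetical, and taking n = 51 (the number of common events) is an assumption about the sample size behind the quoted values.

```python
import math

def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))
```

Under these assumptions, r = 0.448 with n = 51 gives t of roughly 3.5, whereas r = 0.035 gives t of roughly 0.25, in line with the strongly and weakly significant p-values reported.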

Comparison of Estimated CME Speeds and Trajectories
Figure 8 compares the estimated CME speeds (V) and propagation directions (φ), calculated using the FPF technique with each of the EXP, SSW, and JTR catalogues. Figures 8a and 8b compare V and φ estimates generated from the JTR and EXP profiles, while Figures 8c and 8d compare the V and φ estimates generated from the SSW and EXP profiles. In each panel the red line marks the one-to-one line. The scatter of the points about these one-to-one lines demonstrates that for both the V and φ estimates, the SSW and EXP profiles yielded more similar estimates than did the JTR and EXP profiles. Considering specifically the comparison of the JTR and EXP V estimates (Figure 8a), it is evident that the points are not distributed evenly around the one-to-one line. The largest residuals are caused by the JTR estimates being much larger in magnitude than the EXP estimates. In contrast, the SSW and EXP V estimates are in better agreement and appear quite evenly distributed about the one-to-one line. By far the poorest agreement is between the JTR and EXP φ estimates, where the φ_JTR values display much more variance than the φ_EXP values and the two quantities are poorly correlated. Therefore, although the JTR and EXP profiles have more similar structure than the SSW and EXP profiles, the estimates of the CME kinematics from the SSW and EXP profiles are most similar. This is almost certainly due to the limited elongation extent of the JTR profiles yielding poor estimates of V and φ, as also demonstrated by Williams et al. [2009].
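For orientation, the forward model that FPF fits to a t-ε profile can be sketched as below, using the commonly quoted fixed-phi geometry in which a point-like feature moves radially at constant speed V along a direction at angle φ to the Sun-observer line. This is a sketch under stated assumptions, not the fitting code used here: the function name, the default observer distance of 1 AU, and the choice of measuring time from launch at Sun center are all illustrative.

```python
import math

AU_KM = 1.495978707e8  # one astronomical unit in kilometres

def fpf_elongation(t_hours, v_kms, phi_deg, d_au=1.0):
    """Elongation (degrees) of a point moving radially at constant speed.

    t_hours : time since the feature left Sun center
    v_kms   : radial speed V in km/s
    phi_deg : angle between the Sun-observer line and the propagation direction
    d_au    : Sun-observer distance in AU (assumed constant)
    """
    r = v_kms * t_hours * 3600.0          # heliocentric distance in km
    d = d_au * AU_KM
    phi = math.radians(phi_deg)
    return math.degrees(math.atan2(r * math.sin(phi), d - r * math.cos(phi)))
```

Close to the Sun this curve is nearly linear in time, so V and φ are largely degenerate there; the curvature that separates them only develops at larger elongations, which is one way to see why short tracks give poorly constrained fits.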
This analysis was extended to consider in more detail how the different CME profiles affect the kinematics estimated from the single-spacecraft-fitting techniques. Given that the kinematic fits to the JTR profiles appear to be poorly constrained, we exclude the JTR profiles from further analysis. To make a fair comparison between the EXP and SSW tracks, the elongation range of each profile is limited to the smaller maximum elongation of the two profiles; by doing so, the only variable between the SSW and EXP profiles is their structure, not their extent. The mean maximum elongation of each profile was 43.9°, with a standard deviation of 8.42°, while only two events had maximum elongations of less than 30°, the minimum being 28°. Therefore, it was considered appropriate to apply the single-spacecraft-fitting techniques to these events. Speed and direction estimates were then obtained by FPF, HMF, and SSEF, for each of the SSW and EXP profiles. For the SSEF, we assume a CME half width of 30° for all events. We also repeated the analysis assuming a CME half width of 45°, as assumed by Möstl et al. [2014], and the results are qualitatively the same. Figure 9 presents these data. Figure 9 (top) displays the speed estimates, while Figure 9 (bottom) displays the direction estimates. Generally, for each event and each fitting technique, there is good agreement between the estimates obtained from the SSW and EXP profiles (cf. Figure 3d from Davies et al. [2012], which compares the HMF and FPF estimates for many more transients, all from the EXP catalogue). However, closer inspection suggests that the different profiles can cause systematic differences between the estimates, which are comparable in magnitude with the differences found by applying the range of methods. This is clearly demonstrated by events 5 and 10 in Figure 9, where the sets of triangles and crosses are separated by more than the variability within the sets of markers.
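The matching of elongation extent described above can be sketched as follows. The function name and the representation of each profile as a list of (time, elongation) pairs are assumptions for illustration.

```python
def truncate_to_common_extent(profile_a, profile_b):
    """Limit two t-elongation profiles to the smaller of their maximum elongations.

    Each profile is a list of (time_hours, elongation_deg) pairs.
    Returns the two truncated profiles and the common elongation limit.
    """
    limit = min(max(e for _, e in profile_a), max(e for _, e in profile_b))

    def keep(profile):
        return [(t, e) for t, e in profile if e <= limit]

    return keep(profile_a), keep(profile_b), limit
```

After truncation both profiles span the same elongation range, so any remaining difference in the fitted kinematics reflects the shape of the profiles rather than how far each was tracked.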
These differences become clearer if the data are averaged. Figure 10 displays the mean speed and direction estimates for each event, calculated by averaging the results of the FPF, HMF, and SSEF for the SSW profiles (blue squares) and EXP profiles (pink triangles), and the error bars are 1 standard deviation of the three estimates contributing to each average value. The mean values for each profile are often separated by more than the standard deviation of the three estimates. Finally, Figure 11a compares the magnitude of the difference in the mean speeds with the standard deviation of the speeds calculated with the SSW profiles (blue squares) and EXP profiles (pink triangles), while the red line marks the one-to-one line. The distribution of the points is biased to locations below the one-to-one line, with 36 (71%) of the SSW points and 32 (63%) of the EXP points below the line. Figure 11b repeats this analysis for the CME direction estimates and is consistent with Figure 11a, with 31 (61%) of the SSW points and 32 (63%) of the EXP points below the one-to-one line.
Effectively, this shows that the tracking method frequently causes more variability in the speed and direction estimates than does the range of FPF, HMF, and SSEF techniques.
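The comparison underlying Figure 11 can be sketched as below: for each event, the spread among the three fitting techniques is compared with the separation between the technique-averaged SSW and EXP estimates. This is an illustrative sketch; the function name and input layout are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption.

```python
from statistics import mean, stdev

def count_tracking_dominated(events):
    """Count events where the profile choice matters more than the technique.

    events: list of (ssw_estimates, exp_estimates), each a list of the three
    values obtained from FPF, HMF, and SSEF for that profile.
    Returns (n_ssw, n_exp): the number of events in which the standard
    deviation across techniques is smaller than |mean_SSW - mean_EXP|.
    """
    n_ssw = n_exp = 0
    for ssw, exp in events:
        diff = abs(mean(ssw) - mean(exp))
        n_ssw += stdev(ssw) < diff
        n_exp += stdev(exp) < diff
    return n_ssw, n_exp
```

An event is counted when swapping the t-ε profile shifts the mean estimate by more than the techniques disagree among themselves, which corresponds to a point below the one-to-one line in Figure 11.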
A limitation to this analysis arises due to biases within the SSW catalogue. Barnard et al. [2014] discuss how SSW is probably biased to identifying the biggest and brightest CMEs observable in the HI FOV; such events are more likely to have a moderately low φ value. Furthermore, very few fast CMEs were identified by SSW; in the sample used here, only 4 of 51 probably have speeds > 500 km s⁻¹. Consequently, regions of the V-φ parameter space with high V and/or high φ have not been robustly explored by this analysis. Therefore, it is possible that in these regions, the relative differences in the V and φ estimates due to the CME tracking and kinematic models will be different than those found here.
Figure 11b. The same structure as Figure 11a but for CME direction estimates. In each case it is clear that more points lie below the one-to-one line than above, indicating that differences between the SSW and EXP profiles frequently cause larger differences in the speed and direction estimates than do the FPF, HMF, and SSEF techniques.
Our interpretation of these results is not that either of the EXP or SSW profiles more accurately represents the CME parameters than the other. Indeed, both the SSW and EXP profiles have been used in previous peer-reviewed research articles [Barnard et al., 2014; Davies et al., 2012; Tucker-Hood et al., 2015] and are considered acceptable interpretations of the CME front's t-ε profile. However, this does highlight how sensitive the results of the single-spacecraft-fitting techniques can be to the t-ε profiles to which they are applied. Consequently, the conclusions of studies employing these techniques are sensitive to precisely how the t-ε profiles are identified. This complicates the comparison of results across multiple studies if they make use of different representations of an event or a set of events, or simply if events are tracked by different observers.

Conclusions
This study provides a quantitative comparison of the t-ε profiles of 51 fronts common to the Solar Stormwatch (SSW), RAL-HI event list (EXP), and J-tracker (JTR) catalogues of solar transients observed by the Heliospheric Imager (HI) instruments aboard the twin STEREO spacecraft. Each catalogue adopts a different approach for identifying the CME fronts in J-maps generated from the HI observations: SSW is derived from averaging many manual identifications as part of a citizen science project; EXP is derived from the manual identifications by an expert observer; JTR is derived by application of an automated algorithm employing image processing techniques.
By comparing the time and elongation coordinates of each of the profiles, it was demonstrated that on average, the SSW profiles lead the EXP profiles by approximately 1.76° (3.02 h), while the JTR profiles lag the EXP profiles by approximately 0.28° (0.38 h). Although the average elongation differences are small, the distributions of the elongation differences are fairly broad, with standard deviations of 2.24° for the EXP-SSW elongation differences and 1.52° for the EXP-JTR elongation differences. Studies by Williams et al. [2009] and Möstl et al. [2011] argued that the elongation error in manual expert identification was of the order 0.5°. The observed differences between the different sets of profiles are often much larger than this and so cannot be explained by random errors in the manually identified EXP profiles. Analysis of the J-map brightness and brightness gradient at the coordinates of the different profiles revealed the cause of the systematic difference between the SSW profiles and the EXP and JTR profiles. Both the EXP and JTR profiles track close to the light-to-dark boundary at the sunward edge of the CME frontal brightness enhancement, while the SSW profiles tend to track farther from the light-to-dark boundary, inside the leading brightness enhancement and closer to the antisunward edge. Tracking the leading edge of the CME front is probably most useful, as it is this that will trigger the onset of the disturbance in the terrestrial space environment. This should be borne in mind when making comparisons between these profiles. Although the light-to-dark boundary may often be easier to track, its predicted arrival at Earth will probably be after the onset of the terrestrial disturbance.

Both the SSW and EXP profiles tracked the CME fronts over a similar elongation range, out to, on average, approximately 45°. However, the JTR profiles could only follow transients over a much more limited elongation range, out to, on average, approximately 15°. This was attributed to the J-tracker algorithm failing at the merging of the HI1 and HI2 fields of view, which occurs at 18° in the J-maps used here. Therefore, a key area of development for the J-tracker system should be the improved profiling of events over this boundary.
Estimates of the CME kinematics were obtained by employing single-spacecraft-fitting techniques with the JTR, EXP, and SSW profiles. The CME speed and direction estimates obtained by applying FPF to the EXP and SSW profiles were generally in good agreement, but the agreement between the estimates derived from JTR and EXP profiles was much poorer. This was almost certainly due to the limited elongation extent of the JTR profiles causing the kinematic fits to be poorly constrained, which is consistent with the results of previous studies [Williams et al., 2009]. This implies that the J-tracker profiles are currently of limited value when used with the single-spacecraft-fitting methods, although this does not rule out their value in other types of analysis, e.g., automatically identifying the presence of solar wind transients.
Recently, Conlon et al. [2014] investigated the errors in CME kinematics estimated with the FPF technique, caused by the common approximation that the spacecraft is stationary over the duration of the event. Properly accounting for spacecraft motion caused differences in the estimated speeds and directions that were typically less than 100 km s⁻¹ and 4° in magnitude, respectively. The results in section 4.5 demonstrate that differences in the estimated CME kinematics arising due to the different methods of profiling the CMEs are typically larger than the differences due to accounting for spacecraft motion.
Closer inspection of the CME speed and direction estimates, obtained by applying the FPF, SSEF, and HMF techniques to the SSW and EXP profiles, revealed that for a given event, the differences between the fitting techniques were frequently smaller than the differences caused by using either the SSW or EXP profile. In this experiment the EXP and SSW profiles were matched in elongation extent so that the only difference between the matched profiles was their structure over the same elongation range. This emphasizes the sensitivity of the single-spacecraft-fitting techniques to the fine-scale structure of t-ε profiles. Biases in the SSW catalogue mean these results were derived from CMEs with relatively low V and φ values and so they may not apply equally to CMEs with high V and/or large φ. Future work should aim to better establish the relative roles of CME tracking and kinematic models in the variability of the V and φ estimates over the whole V-φ parameter space.

Figure 1. This series of panels details the operation of the J-tracker algorithm. (a) A standard differenced image J-map formed from HI1-A data (below ε = 18.8°) and HI2-A data (above ε = 18.8°), as described in section 2.1, spanning 10 days in February 2010. (b) The J-map after it has been rescaled and then smoothed by a 5 × 5 median filter. (c) Edges in the J-map that meet the criteria discussed in section 2.4. (d) Green dots show the t-ε profiles of transients identified by J-tracker, overlaid on the same J-map shown in Figure 1a.

Figure 2. (a) A J-map formed from HI1-A data and HI2-A data, centered on the ecliptic plane, for a period of 10 days around the onset of SSW-STA event 105. The EXP, SSW, and JTR t-ε profiles are shown as pink triangles, blue squares, and green dots, respectively. The rectangular black region shows a short period of missing data. (b) A schematic of the methodology used to compare the EXP, SSW, and JTR t-ε profiles. The three sets of t-ε profiles are compared in time and elongation by calculating the differences, Δt and Δε, found by interpolating the EXP profiles at the coordinates of the SSW and JTR profiles.

Figure 5. A comparison of the mean and RMS values of Δε and Δt, calculated over the full t-ε profile of each event. (a) Comparison of the event means of Δε_S and Δε_J. (c) Comparison of the event means of Δt_S and Δt_J. The red dashed lines mark the zero point on each axis. (b) Comparison of the RMS values of Δε_S and Δε_J. (d) Comparison of the RMS values of Δt_S and Δt_J. The solid red lines mark the one-to-one line.

Figure 6. (a) Histograms showing distributions of the normalized J-map brightness values interpolated from the coordinates of the EXP, SSW, and JTR t-ε profiles, in pink, blue, and green, respectively. (b) Histograms showing distributions of the temporal (horizontal) gradient in J-map brightness interpolated from the coordinates of the EXP, SSW, and JTR t-ε profiles, using the same coloring as Figure 6a.

Figure 7. (a) Histograms of the distributions of the maximum tracked elongation of the EXP, SSW, and JTR profiles, in pink, blue, and green, respectively. The histograms were calculated using 5° wide bins. (b) Comparison of the maximum tracked elongation of both the SSW (blue dots) and JTR (green dots) profiles with the EXP profiles. The red line is the one-to-one line.

Figure 8. This sequence of plots compares the CME speed (V) and propagation direction (φ) estimates calculated using the FPF technique for each of the EXP, SSW, and JTR catalogues. (a and b) Comparison of V and φ estimates generated from the JTR and EXP profiles. (c and d) Comparison of the V and φ estimates generated from the SSW and EXP profiles. In each plot the red line marks the one-to-one line.

Figure 9. (top) CME speeds estimated for each event by applying the FPF (black), HMF (red), and SSEF (blue) techniques to the SSW (triangles) and EXP (crosses) profiles. (bottom) The same structure as Figure 9 (top) but for the CME direction estimates.

Figure 10. (top) Mean CME speeds, from averaging the FPF, HMF, and SSEF speeds, derived from the SSW (blue) and EXP (pink) profiles. The error bars are 1 standard deviation of the FPF, HMF, and SSEF estimates. (bottom) The same structure as Figure 10 (top) but for the CME direction estimates.