The Visual Complexity of Coronal Mass Ejections Follows the Solar Cycle

The Heliospheric Imagers on board National Aeronautics and Space Administration (NASA)'s twin STEREO spacecraft show that coronal mass ejections (CMEs) can be visually complex structures. To explore this complexity, we created a citizen science project with the U.K. Science Museum, in which participants were shown pairs of CME images and asked to decide which image in each pair appeared the most “complicated.” A Bradley‐Terry model was then applied to these data to rank the CMEs by their “complicatedness,” or “visual complexity.” This complexity ranking revealed that the annual average visual complexity values follow the solar activity cycle, with a higher level of complexity being observed at the peak of the cycle. The average complexity of CMEs observed by STEREO‐A was also found to be significantly higher than those observed by STEREO‐B. Visual complexity was found to be associated with CME size and brightness, but our results suggest that complexity may be influenced by the scale‐sizes of structure in the CMEs.


Observations of CMEs
Coronal mass ejections (CMEs) are eruptions of plasma and magnetic field from the Sun, which travel outward into interplanetary space (e.g., Webb & Howard, 2012). Should these reach the Earth, they can trigger intense geomagnetic storms, which can cause widespread power cuts (Eastwood et al., 2018), damage satellites, and disrupt radio communications (Cannon et al., 2013). With our growing reliance on technology, we are becoming increasingly vulnerable to these storms, making it essential that we learn as much as we can about the nature and evolution of CMEs.
CMEs have been observed both as they leave the Sun using coronagraphs, and as interplanetary CMEs (ICMEs) in in situ data since the 1970s (Tousey, 1973). In coronagraph images, CMEs are often observed to have a three-part structure, which resembles a lightbulb, consisting of a bright outer front followed by a cavity, with a prominence inside (Illing & Hundhausen, 1986). However, CMEs can look very different from one another. This may be due to differences in CME properties, or due to the relative viewpoint of the observer. In coronagraph images, CMEs are projected onto the camera's image plane, meaning that observed CME properties such as angular width and speed are unlikely to be the true values (e.g., Howard, Nandy, et al., 2008). This means that if a lightbulb-shaped CME is ejected at an inconvenient angle compared to the imager, the CME might not appear to have this structure (Vourlidas et al., 2017). CMEs vary in width, from jets, where the CME appears as a narrow band of material (see Chen, 2011), to halo CMEs, where material appears to be coming from the whole Sun (Howard et al., 1982). CMEs also vary in brightness, and these differences can be used to estimate the mass of a CME (e.g., Colaninno & Vourlidas, 2009), although CME mass estimation is difficult without a three-dimensional view of the CME (De Koning, 2017; Hutton & Morgan, 2017). Hutton and Morgan (2015) estimated the masses of 133 CMEs in coronagraph images, linking the structure of the CMEs observed to the corresponding eruption seen in extreme-ultraviolet images. They concluded that there are two different types of CMEs: brighter CMEs with a three-part structure and a comparatively high mass, and dimmer unstructured CMEs with lower masses.
In in situ data, ICMEs can be identified by a range of characteristics, as ICMEs do not always show the same signatures. However, it is widely agreed that a subset of ICMEs called magnetic clouds can be identified by three factors: a smooth rotation of the magnetic field, an enhancement of magnetic field greater than 10 nT, and low proton temperatures (Zurbuchen & Richardson, 2006).
There is a large gap between when CMEs are observable by coronagraphs and when they are detectable as ICMEs, during which CMEs interact with the ambient solar wind (Gopalswamy et al., 2000). The Solar Mass Ejection Imager (SMEI) on board the Coriolis spacecraft (Eyles et al., 2003) was the first attempt to observe CMEs in this region. The second attempt was made when the STEREO spacecraft were launched in 2006 (Kaiser et al., 2008): two near-identical spacecraft carrying the Sun Earth Connection Coronal and Heliospheric Investigation (SECCHI) package (Howard, Moses, et al., 2008). This package contains an extreme-ultraviolet imager (EUVI), which observes the solar disk; two coronagraphs (COR1 and COR2), which observe from 1.5 to 4 and 2.5 to 15 solar radii, respectively; and two heliospheric imagers (HI-1 and HI-2), which observe from 4° to 24° and 18.7° to 88.7° elongation (or approximately 15 to 84 and 66 to 318 solar radii). Thus, the STEREO spacecraft were the first to continuously observe CMEs from the Sun out to 1 AU.
STEREO's Heliospheric Imagers observe visible light, Thomson scattered by free electrons (Billings, 1966). The raw images are dominated by light scattered by dust (known as the F-corona). This excess signal must be removed through background subtraction, which identifies and removes the parts of the image that remain constant over time. This allows moving features such as CMEs to be observed. Each point in the field of view represents all the material seen along that line of sight. Thomson scattering is minimized at 90° from the source of intensity (the Sun), and therefore, electrons on the Thomson sphere, a sphere whose diameter is the line between the observer and the Sun, should appear less bright. However, in any given line of sight, the material on the Thomson sphere is also the material closest to the Sun. As the density of electrons is highest near the Sun, objects closer to the Sun appear much brighter. Therefore, material in the area near the Thomson sphere, known as the Thomson plateau, appears brighter in HI images (Howard & DeForest, 2012).
These images have been widely used to estimate CME speeds by stacking image slices into time-elongation maps (also known as J-Maps) (e.g., Sheeley et al., 1999; Möstl et al., 2017, and references therein). However, individual HI images, which contain a wealth of information, have been relatively overlooked.
Studies that have made use of these images include Viall et al. (2010), who identified periodic structures at scales of around 1,000 Mm, which matched periodicities in in situ data at 1 AU; DeForest et al. (2016), who applied image processing techniques, observing a change in image texture associated with the solar wind transition from being magnetically dominated in the corona to being governed by hydrodynamics in the heliosphere; and Barnard et al. (2017), who found that asking citizen scientists to track CME storm fronts through HI images resulted in more stable CME tracking than an expert using a J-Map. More recently, Barnard et al. (2019) found that variability in the solar wind seen in HI-1 images is correlated with in situ solar wind speed data; Scott et al. (2019) investigated why some CMEs appear to have multiple storm fronts in HI images; and Wharton et al. (2019) incorporated HI images into the CME Analysis Tool used by space weather forecasters.

Space Weather, 10.1029/2020SW002556

Figure 1 shows an example CME (15 June 2011) in a running-differenced HI-1 image from STEREO-A. HI images are running-differenced to make the structure of the CME clearer. This means that the previous image is subtracted from each image. Lighter parts of the image can be interpreted as an increase in the mass along the line of sight between the two images, and darker parts as a decrease. Therefore, a structure expanding outward from the Sun will appear as a bright front followed by a dark shadow. As Figure 1 shows, CMEs appear to have at least one bright leading edge, named the "storm front." It is possible for CMEs to appear to have multiple storm fronts; as the CME is actually a 3-D structure projected onto a 2-D plane, these are likely to be different parts of the same CME storm front, and so the extra visible fronts have been named "ghost fronts."
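The running-difference step described above can be sketched in a few lines. This is a minimal illustration using NumPy on a synthetic one-row "image" sequence; the function name `running_difference` and the toy data are assumptions made for illustration, and no HI calibration or background subtraction is included.

```python
import numpy as np


def running_difference(frames):
    """Subtract the previous frame from each frame of a time-ordered stack.

    frames: array-like of shape (n_times, n_rows, n_cols).
    Returns an array of shape (n_times - 1, n_rows, n_cols).
    """
    frames = np.asarray(frames, dtype=float)
    return frames[1:] - frames[:-1]


# Synthetic data (not HI observations): a bright feature moving one pixel
# to the left in each successive frame.
frames = np.zeros((3, 1, 6))
for t in range(3):
    frames[t, 0, 4 - t] = 1.0

diff = running_difference(frames)
# In each difference image the feature's new position is positive (bright)
# and its previous position is negative (dark).
```

In this toy case the positive pixel sits ahead of the negative one in the direction of motion, which is the "bright front followed by a dark shadow" pattern described in the text.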
CMEs vary dramatically in these images: the brightness of a CME depends on its mass and density and on how near it is to the spacecraft, and CMEs exhibit a wide range of angular widths. Some CMEs resemble the lightbulb shape often observed in coronagraph images (e.g., the middle panel of Figure 4; Wood et al., 2011), and many studies have claimed to identify the CME front, cavity, and prominence (e.g., Davis et al., 2009), but in many cases, the CMEs show no such structure, instead appearing to contain complicated and messy patterns (e.g., the right panel of Figure 4). Some CMEs have storm fronts that appear distorted, showing convex or concave structure (Savani et al., 2010). While these visual differences are easily observable by the human eye, quantifying them is a challenging task. Therefore, a citizen science project was created in which volunteers helped identify visual differences between CMEs observed in HI-1 images.

Citizen Science
Citizen scientists have proven invaluable to scientific progress, both to the study of CMEs (through the Solar Stormwatch project; Barnard, Scott, Owens, Lockwood, Tucker-Hood, et al., 2015), and the wider scientific community. Projects such as Galaxy Zoo (Lintott et al., 2008), in which participants classify galaxies according to their shape, and Aurorasaurus (MacDonald et al., 2015), where participants report sightings of the aurora, have attracted thousands of participants and resulted in numerous scientific publications.
Citizen scientists are genuinely interested in contributing to science (Raddick et al., 2010, 2013), and many citizen scientists working together on a problem can achieve much more than an expert working alone. For example, in the first year after Galaxy Zoo was launched, more than 50 million classifications were made, representing a huge amount of data analysis. Another interesting benefit of citizen science is that humans naturally keep an eye out for the unexpected. In Aurorasaurus, participants report when they observe aurora, and these reports are combined with weather and solar wind measurements to create a real-time nowcast of auroral visibility. This led to the discovery of a new type of aurora, named "STEVE," which was identified by amateur astronomers discussing what they had seen on the Aurorasaurus platform (MacDonald et al., 2018).
The Solar Stormwatch project was created by the Rutherford Appleton Laboratory, the Royal Observatory Greenwich, and Zooniverse. The original project contained six different activities designed to analyze several different CME datasets from the SECCHI imagers on the STEREO spacecraft. In "What's that?" participants were asked to watch movies of HI data and spot whether they saw anything unusual, including dust impacts. Davis et al. (2012) used these data to investigate the distribution of dust hitting the STEREO spacecraft. In "Spot" and "Incoming spot," participants watched movies of HI science data and HI beacon data, respectively, and recorded when CMEs occurred. In "Trace-it!" and "Incoming trace-it!" citizen scientists traced these CMEs through HI-1 and HI-2 data using J-Maps, resulting in a catalogue of CMEs (Barnard, Scott, Owens, Lockwood, Crothers, et al., 2015; Barnard, Scott, Owens, Lockwood, Tucker-Hood, et al., 2015; Tucker-Hood et al., 2015). Finally, in "Track-it-back," participants systematically tracked and characterized CMEs in images from the EUVI, COR1, and COR2 imagers. This resulted in a paper showing that the CMEs studied appeared to deflect toward the heliospheric current sheet (Jones et al., 2018). In addition to these six activities, the citizen scientists were asked to look out for images of "circular storms" in HI data, which Savani et al. (2012) used to investigate how the structure of these CMEs changes as they propagate away from the Sun.
Tracing CMEs through coronagraph and HI data is extremely subjective: for example, De Koning (2017) found that the mass estimate of a CME changed dramatically when different experts traced the CME, leading to a 50% uncertainty. However, many citizen scientists can be asked to trace the same CME feature, allowing the uncertainty to be estimated from the distribution of the observations. This makes CME catalogues produced from citizen scientist observations less subjective than those created by experts working alone.

A New Citizen Science Project to Study CMEs
Based on the success of the Solar Stormwatch project, we created a new project with the Science Museum to accompany a new touring exhibition about the Sun. In this project, called "Protect our Planet from Solar Storms," participants looked at running-difference HI images of CMEs from both STEREO spacecraft. They were shown two images of CMEs side by side and were asked to decide which one looked the most complicated, in order to create a ranking of how complicated, or complex, CMEs appear in HI images. In section 2 we describe how the project was created, the data used, and how we created our ranking of CMEs. In section 3 we present our results, and in section 4 we discuss those results.

Images
To examine the complexity of CMEs in Heliospheric Imager data, we created snapshots of 1,111 CMEs observed by the STEREO spacecraft between 2007 and 2016. For each CME, we created a running-differenced HI-1 image at the time the CME reached halfway through the field of view (around 14° elongation). The running-differenced data were normalized, so that every CME image used the same scale, and plotted using the Python matplotlib gray color map. Running-differenced images were used to make sure the structure within the CME was visible, and the times were found using time-elongation tracks from the HELCATS catalogue (EU HELCATS et al., 2018). STEREO-B images, and STEREO-A images after the spacecraft was reoriented following its passage behind the Sun relative to Earth, were rotated by 180° to ensure that the CMEs all appeared to be moving from right to left. This was done to ensure there was no subconscious bias introduced by the orientation of the CME in each image.
Some of the images contained various artifacts, such as planets and comets, which varied over the time period studied in this analysis. Features such as planets are so bright that they cause signal to bleed into adjacent pixels along the same column in the image detector. This can be seen in all three panels of Figure 2. Other artifacts include dust trails (see Figure 2a), missing data blocks (Figure 2b), ghost rings, and comets (Figure 2c). Before being loaded to the project, all the images were manually checked for artifacts, and any CMEs which were completely obscured were removed. However, images where the CME was still visible, such as those in Figure 2, were kept.
Figure 2. Example running-differenced HI-1 images. Image (a) shows a CME with dust trails and a planet, causing a vertical black line across the image. Image (b) also contains a planet, as well as four blocks of missing data. Image (c) contains a comet.

The Citizen Science Project
The citizen science project, "Protect our Planet from Solar Storms," was created by the University of Reading and the U.K. Science Museum, using the open-source citizen science platform made available by the Zooniverse team (Zooniverse.org). Through this web-based interface, participants were served randomized pairs of CME images and asked to identify which CME looked most complicated (see Figure 3 for a screenshot of the activity). Through discussion with U.K. Science Museum outreach experts, the wording "which of these two storms looks the most complicated?" was chosen. Participants were given a brief tutorial showing example CME images, a copy of which is available on Figshare (https://figshare.com/s/923a67b2d974ded4a2c3). The tutorial suggested that simpler storms looked like "bubbles," whereas complicated storms had more internal structure. However, no further guidance was given as to the definition of "complexity." Each pair of images was classified independently 12 times so that a robust decision could be determined.

Paired Comparison Data
To compare each of the 1,111 CME images with every other CME image would have required 616,605 unique comparisons. This would have involved a lot of work by the citizen scientists. Fortunately, not every comparison is needed to create a ranking of this type (David, 1988). Therefore, we chose to ask citizen scientists to make only around 3% of the possible comparisons (20,190 unique comparisons). Each unique comparison was classified by 12 participants, and this was completed in around 5 months.
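The counts quoted in this paragraph can be checked with a short calculation. The variable names below are illustrative only; the numbers are those given in the text.

```python
from math import comb

n_cmes = 1111
all_pairs = comb(n_cmes, 2)        # every CME image against every other one
shown_pairs = 20190                # unique comparisons shown to participants
fraction_shown = shown_pairs / all_pairs
votes_needed = shown_pairs * 12    # each unique comparison classified 12 times

print(all_pairs)                        # 616605
print(round(100 * fraction_shown, 1))   # 3.3
print(votes_needed)                     # 242280
```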
We generated the paired comparisons by adapting the algorithm presented in Figure 6 in Miranda et al. (2009). This shows that all possible unique paired comparisons can be generated by looping over the items to be compared many times. In the first loop, each item is compared with the next item in the list. In the second loop, each item is compared with the item two places further on, and so on. The total number of unique loops for $n$ items is $(n-1)/2$. When the number of items to compare is even, the number of unique loops includes a half loop: this is the loop in which the items can be split into two halves, and the first item of the first half is compared with the first item of the second half, and so on. In this loop the comparisons are repeated, so only half of them are unique.
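The looping scheme described above can be sketched as follows. This is an illustrative reading of the description, not the authors' code or the algorithm of Miranda et al. (2009) verbatim: the function names are hypothetical, items are represented by their list index, and "the item k places further on" is assumed to wrap round to the start of the list.

```python
def loop_pairs(n_items, offset):
    """Unique pairs from one loop: item i is compared with item i + offset,
    wrapping round at the end of the list."""
    pairs = set()
    for i in range(n_items):
        j = (i + offset) % n_items
        pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)


def all_unique_pairs(n_items):
    """Union of every loop; for even n_items the last loop is the half loop."""
    pairs = set()
    for offset in range(1, n_items // 2 + 1):
        pairs.update(loop_pairs(n_items, offset))
    return sorted(pairs)


def subsample_pairs(n_items, offsets):
    """Keep only the chosen loops, giving a sparse but evenly spread design."""
    pairs = set()
    for offset in offsets:
        pairs.update(loop_pairs(n_items, offset))
    return sorted(pairs)


print(loop_pairs(5, 1))          # [(0, 1), (0, 4), (1, 2), (2, 3), (3, 4)]
print(len(all_unique_pairs(5)))  # 10: two full loops for five items
print(len(loop_pairs(6, 3)))     # 3: the half loop for six items
```

Selecting a subset of loops, as in `subsample_pairs`, gives every item the same number of appearances, which is one reason a loop-based subsample is convenient for a ranking design.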
We first uploaded images of CMEs that occurred between 2007 and 2013. There were 971 CMEs, meaning that there were 485 possible unique loops. We included comparisons from every thirtieth loop (16 loops in total), starting with the first loop, and therefore each CME was compared with 16 other CME images. As each loop contains 971 paired comparisons, the total number of unique paired comparisons we showed to participants was 16 × 971 = 15,536.
We then added images of CMEs that occurred between 2014 and 2016. This was done separately as we did not readily have the data for these events when the first images were uploaded. There were only 140 CMEs recorded in the HELCATS catalogue with time-elongation profiles during this time. This time, choosing the paired comparisons was more complicated, as we needed to compare these CMEs with the CMEs that occurred between 2007 and 2013, as well as with each other. We chose to find comparisons for every thirtieth loop of all the 1,111 CME images, starting from the first loop (18 loops in total), and then to remove comparisons where both CMEs occurred between 2007 and 2013. CMEs between 2014 and 2016 were therefore compared with 18 other CME images, resulting in 4,654 more comparisons.
Figure 3. Users were asked to identify which CME looked the most complicated by selecting the appropriate button on the panel at the right-hand side of the interface.

Ranking Creation
To turn the paired comparison data into a ranking of perceived CME complexity, we fitted a Bradley-Terry model (Bradley & Terry, 1952; Thurstone, 1927), which is widely used to analyze paired comparison data. This model assumes that in one comparison between two objects (in this case two CMEs) $i$ and $j$, the odds that $i$ beats $j$ are $\alpha_i/\alpha_j$, where $\alpha_i$ and $\alpha_j$ are positive parameters describing the worth of each object. The worth of the object numerically describes the property considered during the paired comparisons. For example, if people were choosing which ice cream looked tastiest, the worth of the ice cream would numerically describe how tasty it looked, in comparison to the other ice creams. This model can be expressed as
$$\operatorname{logit}\left[P(i \text{ beats } j)\right] = \lambda_i - \lambda_j,$$
where $\lambda_i = \log \alpha_i$. This is a special case of logistic regression, and therefore, the parameters $\{\lambda_i\}$ can be estimated by maximum likelihood. However, to calculate the model parameters a constraint is needed. Here we use the reference object constraint, where one object is assumed to have a worth of 0 on this logarithmic scale ($\lambda = 0$). Hence, the model parameters of the remaining objects can be calculated as their relative worth in comparison to the reference object. These parameters can then be used to rank the objects by their worth; in this case, to rank the CMEs by their relative complexity.
We used the R BradleyTerry2 package (Turner & Firth, 2010) to fit the Bradley-Terry model to the data. In this package, the reference object was assigned as the first CME in the list of data. Changing the reference CME simply shifts the scale of relative complexity, so that the new reference CME has a relative complexity of 0.
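To make the model concrete, the following is a minimal Python sketch of a Bradley-Terry fit with a reference-object constraint. It is not the BradleyTerry2 implementation used in this study: it swaps in a simple fixed-point (minorization-maximization) iteration for the maximum likelihood estimate, the function name and the toy win counts are invented for illustration, and it assumes that every item wins at least once and that all items are linked by comparisons.

```python
import math


def fit_bradley_terry(wins, n_iter=1000, tol=1e-12):
    """Estimate log-worths from a square table of win counts.

    wins[i][j] is the number of times item i was preferred over item j.
    Returns log-worths relative to item 0, so the reference item is 0.
    """
    n = len(wins)
    total_wins = [sum(row) for row in wins]
    alpha = [1.0 / n] * n
    for _ in range(n_iter):
        new_alpha = []
        for i in range(n):
            # Sum over opponents of (number of meetings) / (alpha_i + alpha_j)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (alpha[i] + alpha[j])
                for j in range(n)
                if j != i
            )
            new_alpha.append(total_wins[i] / denom)
        scale = sum(new_alpha)
        new_alpha = [a / scale for a in new_alpha]
        converged = max(abs(a - b) for a, b in zip(new_alpha, alpha)) < tol
        alpha = new_alpha
        if converged:
            break
    return [math.log(a) - math.log(alpha[0]) for a in alpha]


# Invented example: three images, each pair judged 12 times.
toy_wins = [
    [0, 9, 10],  # image 0 preferred 9 times over image 1, 10 times over image 2
    [3, 0, 8],
    [2, 4, 0],
]
log_worth = fit_bradley_terry(toy_wins)
ranking = sorted(range(len(log_worth)), key=lambda i: log_worth[i], reverse=True)
```

With item 0 as the reference, the other log-worths are negative here because item 0 was preferred most often; choosing a different reference would shift all values by a constant without changing the ranking.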
The Bradley-Terry model works by assuming that each of the paired comparison results are independent (Cattelan, 2012). This is not completely true here, as many of the volunteers who participated in the citizen science project completed more than one comparison. However, 4,028 volunteers took part in the project, completing 246,692 comparisons, so on average each participant completed only 12 comparisons. In reality, the number of classifications completed ranged from the one user who completed 16,254 classifications to the 325 users who completed only one classification each. Although we recorded 4,028 volunteers, only 2,212 were logged into Zooniverse accounts, meaning that the other 1,816 volunteers may not be different volunteers, if, for example, the same person participated without logging in on multiple occasions, or participated without logging on, then made an account, and continued to participate using their account. However, we believe that the effect of assuming independence of observations is likely to be small, in this case.

Results
Having ranked the CMEs in terms of their subjective complexity, we could then place them in ranked order. Figure 4 shows three example CMEs from the ranking, demonstrating CMEs with characteristically low (left), intermediate (center), and high (right) relative complexity values. There are distinct visual differences between these three CMEs. The low complexity CME is faint with little structure, whereas the high complexity CME is bright, with a large angular width and many intricate structures visible within it. An animation showing the full ranking of 1,100 storms, along with their relative complexity values, is available online (https://figshare.com/s/4e44ec2f4aaa17b5fbcd). Using both a t test and the Kolmogorov-Smirnov test, we found that CMEs observed by STEREO-A were significantly more complex than CMEs observed by STEREO-B, using a significance level of 1%. To investigate these differences further, we compared the relative complexity of CMEs from STEREO-A images against the relative complexity of those same CMEs found from STEREO-B images. For 149 of the 188 CMEs observed by both spacecraft, which were identified using the list provided in the HELCATS catalogue (EU HELCATS et al., 2018), the CME was found to be more complex in the STEREO-A image. The mean relative complexity of these events was found to be significantly different between STEREO-A and STEREO-B images, at a significance level of 1%, using the dependent t test for paired samples.
We also investigated whether there was evidence of the solar cycle trend in the raw data from the project, without fitting the Bradley-Terry model. Figure 6 shows the percentage of "wins" by year; we define a "win" where at least 7 of the 12 participants who looked at the paired comparison chose the image from that year as "more complicated." We show the percentage to account for the different number of CMEs in each year. Paired comparisons where both images were from the same year were excluded. Figure 6 clearly shows that the winning image was from STEREO-A (pink) more often than STEREO-B (blue) and that the percentage of wins varies over the solar cycle in the same way as the annual average visual complexity values.
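The "win" tally described above can be sketched as follows. The record layout, function name, and example votes are hypothetical, and the percentage is assumed here to mean wins divided by the number of cross-year comparisons in which an image from that year appeared; the split by spacecraft shown in Figure 6 is omitted for brevity.

```python
from collections import Counter

THRESHOLD = 7  # a "win" needs at least 7 of the 12 votes, as defined in the text


def win_percentage_by_year(comparisons, threshold=THRESHOLD):
    """comparisons: iterable of (year_left, year_right, votes_left, votes_right)."""
    wins = Counter()
    appearances = Counter()
    for year_left, year_right, votes_left, votes_right in comparisons:
        if year_left == year_right:
            continue  # same-year pairs are excluded, as in the text
        appearances[year_left] += 1
        appearances[year_right] += 1
        if votes_left >= threshold:
            wins[year_left] += 1
        elif votes_right >= threshold:
            wins[year_right] += 1
        # otherwise (a 6-6 split) neither image is counted as a win
    return {year: 100.0 * wins[year] / appearances[year] for year in appearances}


# Invented votes, for illustration only.
toy_comparisons = [
    (2008, 2012, 2, 10),
    (2012, 2008, 9, 3),
    (2008, 2012, 6, 6),
    (2012, 2012, 8, 4),
    (2008, 2010, 7, 5),
]
percentages = win_percentage_by_year(toy_comparisons)
```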

Solar Cycle Trends in the Literature
It has long been known that the Sun has an approximately 11-year cycle, known as the Schwabe cycle, seen most noticeably in the number of sunspots present on the Sun (Schwabe, 1843), but there is a whole range of solar phenomena that show similar cyclic behavior (see review by Hathaway, 2015, for a more complete description of the solar cycle). Both the Sun's magnetic field (Babcock, 1959) and sunspots on the Sun's surface (Hale et al., 1919) change polarity between solar cycles, and the latitudes at which sunspots are present on the Sun vary through the cycle (Maunder, 1904). Changes in solar irradiance through the solar cycle can influence the Earth's climate (see review by Haigh, 2007), and the structure of the solar wind changes dramatically between solar maximum and solar minimum (McComas et al., 1998, 2003). During solar minimum, the solar wind appears to have a clearer bimodal structure; the solar wind from the Sun's equator tends to be slower and variable, but above this band and toward the Sun's poles, the solar wind is generally faster and more constant (around 750-800 km/s). However, during solar maximum the situation is much more complicated; the Sun's corona appears highly complex (e.g., Morgan & Cook, 2020), and the solar wind varies between fast and slow at all latitudes. Observed trends in the frequency of CMEs (e.g., Gopalswamy et al., 2009; Harrison et al., 2018; Robbrecht et al., 2009) and solar flare occurrence (e.g., Kossobokov et al., 2012) also follow the trend in solar activity and sunspot number.
Some studies have also found solar cycle variation in CME properties. Howard, Nandy, et al. (2008) found that near solar minimum, CMEs tend to originate from the Sun's equatorial region, but as the cycle progresses, CMEs can come from all latitudes. Yashiro et al. (2004) presented a catalogue of CMEs observed in LASCO coronagraph images, finding that the average apparent width and speed of CMEs increased from 47° and 281 km/s at solar minimum (1996) to 61° and 499 km/s at solar maximum (1999). Gopalswamy (2016) plotted mean and median apparent speeds of CMEs observed by SOHO in the CDAW CME catalogue between 1995 and 2015, finding that the trend in average speeds matches the variation in the sunspot number. Petrie (2015) looked at CME velocities found from the CDAW, SEEDS, and CACTus CME catalogues, finding that although there were differences between the catalogues (as they were created using different methods: SEEDS and CACTus are different automated algorithms applied to coronagraph images, whereas the CDAW catalogue was created by expert inspection of the data), all three showed similar solar cycle variation, which followed the sunspot cycle. Vourlidas et al. (2010) found similar, if slightly less clear, trends in their estimates of CME mass and kinetic energy from LASCO coronagraph data.
More recently, solar cycle variation has been observed in CMEs identified in HI images through the HELCATS project. Harrison et al. (2018) found that the median angular width of CMEs observed in HI-1A increases from around 50° in solar minimum (2008) to 85° in solar maximum (2012) and then decreases to 55° near the next solar minimum (2016). They also found similar trends in CMEs observed by HI-1B and the COR-2A, COR-2B, and LASCO coronagraphs. Barnes et al. (2019) tracked the HELCATS CMEs through J-Maps, applying various geometric-fitting techniques to the data to estimate CME speed and direction. They found that the median speeds of CMEs observed in HI-1A and HI-1B increase from around 400 km/s in 2008 to 550 km/s in 2012, and the average speeds from HI-1A decrease to around 500 km/s in 2016. This solar cycle variation can also be observed in ICMEs found using in situ data: Richardson and Cane (2004) found that the percentage of ICMEs identified as magnetic clouds decreases as solar activity increases, and Nieves-Chinchilla et al. (2019) observed that the magnetic field configuration within ICMEs varies with the solar cycle.

What Were the Citizen Scientists Observing as Visual Complexity?
If the participants of Protect our Planet from Solar Storms had chosen which CMEs they thought were most complicated randomly, their choices would follow a binomial distribution. In that case, 23% of the comparisons would have been ties (six votes each), but less than 0.1% of the results would have been unanimous (see the dark blue bars in Figure 7). However, only 5% of the comparisons shown resulted in ties, and the result was unanimous in 18% of the comparisons (see the light blue bars in Figure 7). This suggests that the participants were not choosing randomly, and moreover, that they often agreed with each other, implying that they were using the same criteria to make their decisions. As a further check, the authors of this paper examined the complexity ranking (as demonstrated in Figure 4) and observed that the CMEs appear to change in a consistent way throughout the ranking. So, what is it these citizen scientists were observing as visual complexity? How were they making their decisions?
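The random-choice baseline quoted above follows directly from a binomial distribution with 12 votes and a probability of 0.5 for each image. A short check, with illustrative variable names:

```python
from math import comb

n_votes = 12
outcomes = 2 ** n_votes                # equally likely vote patterns under random choice
p_tie = comb(n_votes, 6) / outcomes    # exactly six votes for each image
p_unanimous = 2 / outcomes             # all 12 votes for the left or for the right image

print(round(100 * p_tie, 1))           # 22.6
print(round(100 * p_unanimous, 3))     # 0.049
```

These values reproduce the 23% and "less than 0.1%" figures for purely random voting, against which the observed 5% ties and 18% unanimous results are compared.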
Through the online forum associated with this project, participants were able to interact with the project team and each other. When asked what they considered a complex or complicated CME, participants described the following characteristics: "bigger"; "brighter"; "complicated patterns"; "messiest"; "least lightbulb-like shape"; "without a clear front"; "multiple fronts"; "more white regions." To investigate the physical differences between complex and "not complex" CMEs, it will be necessary to objectively define "visual complexity" in terms of quantitative differences between CME images. This will be the subject of a future study. Here, we investigate whether the citizen scientists might have been influenced by factors unrelated to the CMEs and how CME brightness and CME size contribute toward visual complexity.

Factors Unrelated to CMEs
It is possible that some participants exhibited a bias in their selection of the left or right image, which is particularly likely in situations where both images are similar (Englund & Hellström, 2013). To investigate this, we calculated the number of times for which the left and right image won (i.e., at least 7 people out of 12 chose the left or right image as "more complicated"). In total, the image on the left won 9,941 times (49%) and the image on the right won 9,361 times (46%). This suggests there may be a very small bias toward the left image, but as it is only a few percent, we consider this unlikely to be significant.
As mentioned in section 2.1, the CME images contained various artifacts, such as comets and planets. To investigate whether these artifacts biased the citizen scientists, we plotted visual complexity against the number of pixels in each image recorded as missing, which can be interpreted as the amount of the image obscured by artifacts. We found that there was no trend between complexity and missing pixels and therefore concluded that these artifacts were unlikely to have caused any bias in the ranking.

Is Visual Complexity Influenced by the Size of the CME?
The HI images used in the project were all chosen so that the CME storm front was approximately halfway through the field of view (14° elongation), so that all CMEs were presented at a similar phase of their expansion. However, the angular width of the CMEs varied, so it is possible that the citizen scientists were simply choosing the biggest CME as the most complicated. To investigate this, we first plotted the angular width of the CME in HI-1 images (found from the HELCATS catalogue) against the visual complexity of each CME. Figure 8 clearly shows that there is a correlation between angular width and visual complexity. The Spearman's rank correlation coefficients for this plot were 0.60 and 0.44 for STEREO-A and STEREO-B CMEs, respectively, with p values of $1 \times 10^{-62}$ and $9 \times 10^{-24}$. It is therefore clear that the angular width and visual complexity of CMEs are related. This is unsurprising, as previous studies, such as Harrison et al. (2018), have found that the angular width of CMEs varies with the solar cycle.
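For readers unfamiliar with Spearman's rank correlation, the coefficient is the Pearson correlation of the ranks of the two variables. The following is a minimal pure-Python sketch with invented widths and complexity values; it does not compute p values, the function names are illustrative, and it is not the routine used in this study.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Invented values, for illustration only.
toy_widths = [10, 20, 30, 40, 50]   # angular widths in degrees
toy_complexity = [1, 3, 2, 5, 4]    # relative complexity values
rho = spearman_rho(toy_widths, toy_complexity)
```

Because only ranks are used, the coefficient measures whether wider CMEs tend to be ranked as more complex without assuming a linear relationship between the two quantities.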
Next, we split the CMEs into four groups, corresponding to the quartiles of the width distribution, to consider CMEs with similar angular widths separately. Figure 9 shows the visual complexity of CMEs over time for each of the four groups: CMEs with angular widths of 50° or less; CMEs with angular widths greater than 50° but less than 75°; CMEs with angular widths greater than 75° but less than 110°; and CMEs with angular widths greater than or equal to 110°. In each subset, the visual complexity varies over the solar cycle in the same way as in Figure 5. We therefore conclude that the trend in visual complexity over time for all CMEs (shown in Figure 5) cannot be caused by variations in angular width alone, and so the citizen scientists must also have been looking at other CME characteristics to make their decisions.
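The grouping and per-group annual averaging can be sketched as follows. This is an illustrative sketch rather than the analysis code: the function names, the group labels, and the `(year, width, complexity)` input layout are hypothetical, and a width of exactly 75° is assumed to fall in the third group.

```python
from collections import defaultdict


def width_group(width_deg):
    """Assign a CME to one of the four angular width groups (degrees)."""
    if width_deg <= 50:
        return "<=50"
    if width_deg < 75:
        return "50-75"
    if width_deg < 110:  # assumption: exactly 75 degrees lands here
        return "75-110"
    return ">=110"


def annual_mean_by_group(cmes):
    """Mean complexity per (width group, year).

    cmes: iterable of (year, width_deg, complexity) tuples (assumed layout).
    """
    sums = defaultdict(lambda: [0.0, 0])
    for year, width_deg, complexity in cmes:
        entry = sums[(width_group(width_deg), year)]
        entry[0] += complexity
        entry[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Toy data, not real CMEs.
toy_cmes = [
    (2008, 40, -1.0),
    (2008, 45, -0.5),
    (2012, 48, 0.5),
    (2012, 120, 1.5),
    (2012, 130, 0.5),
]
group_means = annual_mean_by_group(toy_cmes)
```

Plotting each group's annual means against time then shows whether the solar cycle trend survives within a narrow range of widths.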

Is Visual Complexity Influenced by CME Brightness?
The three CMEs shown in Figure 4 demonstrate how the brightness of CMEs varies in HI images. Figure 4 (left: least complex) has only one small white region, whereas Figure 4 (right: most complex) has a large white storm front as well as multiple white regions within the CME. This might suggest that the citizen scientists simply chose the CME that appeared brightest as the most complicated. We devised two simple tests to explore whether this was the case.
First, we counted the number of bright pixels in each image and compared this to the visual complexity of each image (see Figure 10). The brightest 12.5% of pixels were considered bright pixels. As wider CMEs cover more pixels than narrower CMEs, and therefore contain more bright pixels, we normalized the number of bright pixels in an image by the angular width of the CME in the image. Specifically, we divided the number of bright pixels by the total number of pixels within the CME area, to find the fraction of bright pixels in the CME area. Spearman rank correlation coefficients between visual complexity and bright pixels were +0.63 and +0.64 for STEREO-A and STEREO-B images, respectively, suggesting that the citizen scientists were influenced by the brightness of the CME.
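A minimal sketch of the normalized bright-pixel measure is given below. Several details are assumptions made for illustration only: images are taken to be 8-bit (values 0 to 255), "bright" is interpreted as a pixel value in the top 12.5% of that range (224 or above), bright pixels are counted only inside the CME area, and the CME area is supplied as a boolean mask. The name `bright_fraction` is hypothetical.

```python
def bright_fraction(image, cme_mask, threshold=224):
    """Fraction of pixels inside the CME area that count as bright.

    image:     2-D list of 8-bit pixel values (0-255).
    cme_mask:  2-D list of booleans, True where a pixel lies in the CME area.
    threshold: assumed cut-off; 224 is the start of the top 12.5% of 0-255.
    """
    bright = total = 0
    for pixel_row, mask_row in zip(image, cme_mask):
        for value, inside in zip(pixel_row, mask_row):
            if inside:
                total += 1
                if value >= threshold:
                    bright += 1
    return bright / total if total else 0.0


# Toy 2 x 2 image: three pixels lie in the CME area, two of them are bright.
toy_image = [[230, 100], [255, 224]]
toy_mask = [[True, True], [True, False]]
fraction = bright_fraction(toy_image, toy_mask)
```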
Second, we applied histogram equalization to all the CME images and repeated the experiment, uploading the new images to the project and asking citizen scientists to choose which image looked the most complicated. Histogram equalization changes the distribution of pixel values within an image. A histogram of pixel values of a differenced HI image containing a CME has a peak around pixel value 128 (where the color map is gray, corresponding to no difference between the two images that were subtracted to create the running differenced image), and two small peaks around pixel values 0 and 255 (where the color map is black and white, respectively, in regions where the two HI images used to create the running differenced image were very different). Histogram equalization flattens this histogram to make all pixel values similarly likely, thereby making every image contain a similar number of bright pixels. Figure 11 shows the same three CMEs as Figure 4, with histogram equalization applied.
Figure 10. Number of bright pixels in each image versus the relative visual complexity of the CME in the image. Again, STEREO-A storms are shown in pink and STEREO-B storms in blue. The color bar on the right shows the pixel values of the images that we counted as bright.
Figure 11. The same three CMEs as shown in Figure 4, after they have been brightness equalized.
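To make the histogram equalization step concrete, the sketch below implements the standard cumulative-distribution mapping for an 8-bit image in pure Python. It is an illustration of the general technique, not necessarily the exact routine applied to the HI images; the function name `equalize_histogram` is hypothetical.

```python
def equalize_histogram(image, levels=256):
    """Histogram-equalize a 2-D list of integer pixel values in [0, levels)."""
    flat = [value for row in image for value in row]
    n_pixels = len(flat)

    # Histogram and cumulative distribution of pixel values.
    hist = [0] * levels
    for value in flat:
        hist[value] += 1
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    if n_pixels == cdf_min:
        # Constant image: nothing to spread out.
        return [row[:] for row in image]

    # Map each pixel value so that the output values are spread across
    # the full range as evenly as the input allows.
    scale = (levels - 1) / (n_pixels - cdf_min)
    lookup = [round(max(c - cdf_min, 0) * scale) for c in cdf]
    return [[lookup[value] for value in row] for row in image]


# Toy image with values clustered around mid-gray.
toy = [[100, 120], [130, 140]]
equalized = equalize_histogram(toy)
```

After equalization, the four clustered values are spread across the full 0 to 255 range, which is why every equalized image ends up with a similar number of bright pixels.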
We fitted the Bradley-Terry model to this new dataset and plotted the resulting visual complexity values against time (see Figure 12). Figure 12 shows that the visual complexity of the histogram-equalized images follows the same trend over time as that of the original images, and therefore we conclude that the citizen scientists were considering more than the brightness of the CME when deciding which of the two images looked the most complicated.
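As a reminder of what fitting a Bradley-Terry model involves, the sketch below estimates item strengths from paired comparison counts using the classical iterative (minorization-maximization) update, in which the probability that item i beats item j is p_i / (p_i + p_j). This is an assumption-laden illustration, not the fitting code used in this study: the function name, the `(winner, loser) -> count` input layout, and the fixed iteration count are all hypothetical choices.

```python
def fit_bradley_terry(n_items, wins, n_iter=200):
    """Estimate Bradley-Terry strengths, normalized to sum to 1.

    wins: dict mapping (winner_index, loser_index) -> number of wins
          (assumed input layout for this sketch).
    """
    total_wins = [0.0] * n_items
    pair_counts = {}
    for (winner, loser), count in wins.items():
        total_wins[winner] += count
        key = (min(winner, loser), max(winner, loser))
        pair_counts[key] = pair_counts.get(key, 0) + count

    strengths = [1.0 / n_items] * n_items
    for _ in range(n_iter):
        denominators = [0.0] * n_items
        for (i, j), n_compared in pair_counts.items():
            term = n_compared / (strengths[i] + strengths[j])
            denominators[i] += term
            denominators[j] += term
        # Items that were never compared keep their current strength.
        strengths = [
            total_wins[i] / denominators[i] if denominators[i] > 0 else strengths[i]
            for i in range(n_items)
        ]
        norm = sum(strengths)
        strengths = [s / norm for s in strengths]
    return strengths


# Toy data: three images, each pair compared four times.
toy_wins = {(0, 1): 3, (1, 0): 1, (1, 2): 3, (2, 1): 1, (0, 2): 3, (2, 0): 1}
toy_strengths = fit_bradley_terry(3, toy_wins)
```

Sorting items by estimated strength (or its logarithm) gives the ranking. Note that an item with no wins is driven to zero strength in this simple scheme, so practical implementations usually add regularization.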

Summary and Future Work
We have created a ranking of the visual complexity of 1,111 CMEs observed in images from the STEREO HI-1 cameras between 2008 and 2016. This was done by fitting a Bradley-Terry model to paired comparison data generated via a web-based citizen science platform, in which participants were shown pairs of CME images and asked to decide which one looked the most complicated. This ranking shows that the average complexity of CMEs observed by both STEREO spacecraft appears to follow the solar activity cycle, as represented by the sunspot number (see Figure 5). This demonstrates that there is a quantifiable change in the structure of CMEs seen in the inner heliosphere.
In general, we found that the visual complexity of CMEs in HI-1A images is significantly higher than that of CMEs observed in HI-1B images. This may be due to slight differences between the two imagers; STEREO-B is affected by pointing errors, which may blur the structure within the CME, causing features of smaller scale sizes to be less apparent in HI-1B images. To test this theory, we will quantify the scale sizes of the visual features that volunteers are identifying as complex.
Investigation of the CME ranking showed that the citizen scientists were unlikely to have been biased toward the left or right image, or by artifacts within the images. Visual complexity was found to be correlated with both the angular width of the CME and the number of bright pixels within the image. However, when we split the CMEs into groups of similar angular widths, and when we repeated the experiment using brightness-equalized images, we found that the trend in CME visual complexity remained, suggesting that participants based their choices on more than the brightness and angular width of the CME.
Whatever the citizen scientists were observing, the question remains: What is causing these changes? Perhaps there are more eruptions from active regions at solar maximum, resulting in larger and more complex CMEs. Perhaps the complexity occurs after the CME has erupted, as it travels through the ambient solar wind and heliospheric magnetic field, which is more variable toward solar maximum. Or perhaps complex CMEs result from multiple CMEs colliding and merging into one. A future study will both seek to quantify the differences between complex and simple CMEs, and investigate why these differences occur.

Data Availability Statement
This publication uses data generated via the Zooniverse.org platform, development of which is funded by generous support, including a Global Impact Award from Google and a grant from the Alfred P. Sloan Foundation. The paired comparison data from "Protect our Planet from Solar Storms" can be found at https://figshare.com/s/7e0270daa8153bb0416e, and the code used is available at https://github.com/S-hannon/complexity-solar-cycle. This research has made use of SunPy, an open-source and free community-developed solar data analysis Python package (The SunPy Community et al., 2015). The STEREO Heliospheric Imager data are publicly available from the UK Solar System Data Centre (www.ukssdc.ac.uk).