Volume 17, Issue 6 p. 845-860
Research Article

Comprehensive Assessment of Models and Events Using Library Tools (CAMEL) Framework: Time Series Comparisons

Lutz Rastätter
Community Coordinated Modeling Center, Code 674, NASA GSFC, Greenbelt, MD, USA
Correspondence to: L. Rastaetter, [email protected]

Chiu P. Wiegand
Community Coordinated Modeling Center, Code 674, NASA GSFC, Greenbelt, MD, USA
Code 550, NASA GSFC, Greenbelt, MD, USA

Richard E. Mullinix
Community Coordinated Modeling Center, Code 674, NASA GSFC, Greenbelt, MD, USA
Code 587, NASA GSFC, Greenbelt, MD, USA

Peter J. MacNeice
Community Coordinated Modeling Center, Code 674, NASA GSFC, Greenbelt, MD, USA
First published: 22 May 2019


The Comprehensive Assessment of Models and Events using Library Tools (CAMEL) framework leverages existing Community Coordinated Modeling Center services: Run-on-Request postprocessing tools that generate model time series outputs and the new Community Coordinated Modeling Center Metadata Registry that describes simulation runs using Space Physics Archive Search and Extract metadata. The new CAMEL visualization tool compares the modeled time series with observational data and computes a suite of skill scores such as Prediction Efficiency, Root-Mean-Square Error, and Symmetric Signed Percentage Bias. Model-data pairs used for skill calculations are obtained considering a user-selected maximum difference between the time of observation and the nearest model output. The system renders available data for all locations and time periods selected using interactive visualizations that allow the user to zoom, pan, and pick data values along traces. Skill scores are reported for each selected event or aggregated over all events for all participating model runs. Separately, scores are reported for all locations (satellites or stations) and for each location individually. We are building on past experiences with model-data comparisons of magnetosphere and ionosphere model outputs from GEM2008, GEM-CEDAR Electrodynamics Thermosphere Ionosphere, and the SWPC Operational Space Weather Model challenges. The CAMEL visualization tool is demonstrated using three validation studies: (a) Wang-Sheeley-Arge heliosphere simulations compared against OMNI solar wind data, (b) ground magnetic perturbations from several magnetosphere and ionosphere electrodynamics models as observed by magnetometers, and (c) electron fluxes from several ring current simulations compared to Radiation Belt Storm Probes Helium Oxygen Proton Electron instrument measurements, integrated over different energy ranges.

Key Points

  • We present an interactive online data-model analysis tool that is being constructed at the CCMC
  • The online model-data analysis can span multiple events or observatories (of the same kind) to create aggregate skill scores
  • Interactive plots allow users to inspect numerical values during their analysis

1 Introduction

Model-data comparisons have been conducted at the Community Coordinated Modeling Center (CCMC) since 2008, when global magnetospheric models were run for selected geomagnetic storm events. The resulting outputs were compared against magnetic fields measured at GOES satellite locations as well as at magnetometer locations on the ground (Pulkkinen et al., 2010; Rastätter et al., 2011). These studies explored model performance using a variety of skill scores, including Root-Mean-Square Error (RMSE) and Prediction Efficiency (PE) for real-data evaluations and threshold-based (categorical or yes/no) evaluation metrics such as Probability of Detection (POD), Probability of False Detection (POFD), and Heidke Skill Score (HSS; Lopez et al., 2007; Pulkkinen et al., 2013).

Model validation efforts at CCMC also began in 2008 for the heliosphere, where the impact of different magnetogram observations on Wang-Sheeley-Arge (WSA) heliosphere model (Arge & Pizzo, 2000) results near the Earth was studied (MacNeice, 2009a, 2009b). This study looked at hits and misses, timing errors, and minimum and maximum values of solar wind velocity jumps that identify crossings of the heliospheric current sheet.

Soon after these initial studies, observations in the ionosphere were collected to assess the performance of ionosphere/thermosphere models in the CEDAR Electrodynamics Thermosphere Ionosphere (ETI) challenge (Shim et al., 2011, 2012). The GEM, CEDAR, and SWPC study data and model outputs are available on the CCMC website utilizing a metrics visualization tool (called Metrics-Vis in this paper) that analyzes data based on a selection of a single event (predefined time interval), observatory, and physical parameter (described in section 3.1).

The space weather forecasting activities at the CCMC since 2008 have resulted in extensive work in validating modeling of Coronal Mass Ejection (CME) arrival times at Earth using single and ensemble modeling at the CCMC (Mays et al., 2015; Taktakishvili et al., 2009) and the accumulation of community model results in a CME Scoreboard application (Riley et al., 2018; Wold et al., 2018; Verbeke et al., 2019).

Until recently, most of the model validations at the CCMC were available through Metrics-Vis. However, the tool was unable to aggregate multiple events or multiple observation locations (e.g., magnetometer locations) to calculate average skills or to rank models by a single skill or a combination of skills, as was done in, e.g., Pulkkinen et al. (2013), Rastätter et al. (2013, 2014a), and Shim et al. (2011, 2012). The new Comprehensive Assessment of Models and Events Using Library Tools (CAMEL) toolkit expands on the capabilities of Metrics-Vis by presenting multiple observation locations and events in the same interface, allowing skill scores to be aggregated across the selected events and the selected stations or spacecraft locations.

In recent years, data-model evaluations have been pursued in the fields of radiation belt, ring current, global magnetosphere ultralow frequency wave modeling, and magnetopause location studies, extending the need for skill scores to support an ever-widening variety of data (Morley et al., 2018). In order to meet the needs of the community worldwide, the CCMC is developing the CAMEL framework (described in section 2) to support model-data comparisons more effectively and efficiently. The CAMEL visualization front end is demonstrated using three use cases. First, ground magnetic perturbation data from 13 magnetometers are simulated by three global magnetosphere and two statistical ionosphere electrodynamics models (Pulkkinen et al., 2013) for six space weather events; we show CAMEL's capability to collect and average skill scores across multiple events and stations in section 4.1. Second, WSA (Arge & Pizzo, 2000; MacNeice, 2009a) heliosphere simulations with different prediction lead times are compared to hourly OMNI2 (King & Papitashvili, 2005) solar wind data; here we use CAMEL for a case where no events are defined and calculate skills for each month of a yearlong interval. Finally, RAM-SCB (Jordanova et al., 2010; Yu et al., 2011, 2019) and CIMI (Fok et al., 2014) ring current simulations are compared to Helium Oxygen Proton Electron (HOPE) mass spectrometer observations (Funsten et al., 2013; Spence et al., 2013) of electron fluxes on the Radiation Belt Storm Probes (RBSP or Van Allen Probes) for two geomagnetic storm events in section 4.3.

With the ring current data we employ skill scores that penalize underestimations and overestimations of the data equally by using the (base-10) logarithm of the data to calculate scores such as the Symmetric Signed Percentage Bias (SSPB; Morley et al., 2018).

Model-data validation efforts have been essential to terrestrial weather modeling, where the Model Evaluation Tools (MET; Adriaansen et al., 2018; Halley Gotway et al., 2018) are maintained in the Developmental Testbed Center (dtcenter.org) at the National Center for Atmospheric Research (NCAR). MET can calculate skill scores from time series, similar to efforts in space weather research, and also performs pattern matching in 2-D and 3-D gridded data and model outputs (Davis et al., 2006a, 2006b; Prein et al., 2017; Wolff et al., 2014). CCMC has started a collaboration with NCAR to incorporate components of MET into the CAMEL framework; tools in MET will eventually be used to compare observations from two-dimensional images to gridded model output data.

2 CAMEL Framework

The CAMEL framework is an integrated and flexible framework allowing users to seamlessly compare model outputs with similar observational data sets. The back end of the CAMEL framework takes advantage of existing services and will utilize newly-developed tools at the CCMC as they mature. The front end interactive visualization tool powered by the back end infrastructure allows users to plot model outputs and observation data sets together while providing options to calculate various skill scores from our library. In addition to driving the CAMEL interactive visualization tool presented in this paper, the CAMEL framework will provide an Application Programming Interface (API) allowing users to download all data sets available on the CAMEL back end database into their own environment for further analysis.

2.1 Leverage Existing CCMC Services

The CAMEL framework utilizes the following existing CCMC services and systems:
  1. CCMC Metadata Registry (CMR; https://kauai.ccmc.gsfc.nasa.gov/CMR): contains metadata for CCMC models, model runs, and numerical and nonnumerical (e.g., image) outputs. The metadata data model is Space Physics Archive Search and Extract (SPASE) compliant (Roberts et al., 2018), allowing our system to easily connect to the SPASE registry for its registered observation data sets. All metadata for data sets stored in the CAMEL back end database are first registered in CMR. For model-data comparison and validation purposes, it is essential that the CAMEL framework has, and can provide to users, proper metadata regarding its observation data as well as model output. We are continuing to expand the SPASE metadata model to support model chains and coupled frameworks, which have become very common in the Space Physics community, and we interact with the SPASE group weekly. Recently, our suggestions to add Developer, User, and HostContact as new roles were adopted by SPASE to better describe how the CCMC staff support the models we host. The SPASE hierarchy describes the model itself, its execution (including all input parameters and the relevant run environment), and all output from running the model. Model type, region of space, and time interval are the primary parameters used to discover results, with use of more detailed metadata planned. Further changes to SPASE are expected as we work to describe more complex simulations and their output.
  2. Run-on-Request (RoR) system: The CCMC RoR system contains thousands of model runs and forms a broad archive of their outputs. One of the future goals of the CAMEL framework is to utilize the existing RoR archive to validate models for the community. Currently, RoR runs are processed by hand and manually imported into the Metrics-Vis or CAMEL systems; CAMEL exists independently from RoR runs once their outputs have been ingested into CAMEL's database tables. In order to import RoR outputs automatically, a next-generation RoR system (RoR-NextGen) is being designed and developed that will automatically register and access simulation metadata in the CMR. Therefore, all existing and future model runs and output from the RoR system will carry proper SPASE-compliant metadata. When RoR-NextGen has matured, the CAMEL framework will be able to connect to and search the archive of RoR runs at the CCMC, allowing users to find and use all relevant output for their validation efforts.
  3. Postprocessing tools: these include existing tools available to a user at CCMC such as the (online) visualization (CCMC-Vis) that performs extraction of time series data along satellite trajectories or at locations on the ground or in space. CalcDeltaB (Rastätter et al., 2014b) calculates magnetic perturbations at magnetometer stations from electric currents in the magnetosphere and ionosphere (available as a postprocessing run request applied to any magnetosphere model simulation). Other tools at the CCMC include the internally developed and used Flexible Data Ingestion Tool, a generic data parser that reads time series data and stores them in a database table.

2.2 CAMEL Back End

The CAMEL back end infrastructure is designed and developed with flexibility in mind. The CAMEL database contains tables describing each validation study and captures information about the parameters that are relevant to a validation study (in terms of observation data and locations). Information regarding the various validation studies is also captured in the database, along with available predefined time intervals when applicable. All data sets are mapped to their appropriate SPASE descriptions identified by their SPASE ID in CMR. This allows the CAMEL back end to retrieve detailed metadata regarding its data sets from CMR when necessary. Communication between the CAMEL back end and the interactive front end is implemented via a web service interface. The web service interface can also be used by other applications if users prefer to build their own visualization tools on top of the CAMEL data sets, and it provides an option for users to download data sets and analyze them in their own favorite tools or frameworks.

One of the most powerful and crucial components of the CAMEL back end infrastructure is the library of skill scores, which may be applied to any set of observation-model-output pairs. Skill scores are single numerical values resulting from a summation over applicable samples within a chosen time range (event) or list of events. Samples are observation times where model output values were either matched in time or close enough to be interpolated (given a tolerance entered by the user). Skill scores are tabulated for the events, simulation runs, and spatial zones selected in a visualization. Currently, the CCMC is looking into incorporating the MET package from NCAR into the CAMEL framework, which would offer the space science community calculations and formulas that the terrestrial weather community is already employing.

Before skill scores are calculated, data interpolation and filtering options are applied. Interpolation options implemented are nearest neighbor and linear interpolation. Filters will apply smoothing and consider user-selected minimum and maximum values for either observation or model results to be accepted in skill score calculations.
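As an illustration, the min/max acceptance filter described above could be sketched as follows (a minimal Python sketch, not CCMC code; the function and parameter names are hypothetical):

```python
import numpy as np

def filter_samples(times, values, vmin=None, vmax=None):
    """Keep only samples whose values are finite and fall inside a
    user-selected [vmin, vmax] acceptance window (either bound optional)."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    mask = np.isfinite(values)          # drop NaN/inf fill values
    if vmin is not None:
        mask &= values >= vmin
    if vmax is not None:
        mask &= values <= vmax
    return times[mask], values[mask]
```

The same filter would be applied to either the observation or the model series before the matched pairs are formed.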

The back end (database server) runs at CCMC and is currently only available through the CAMEL front end run on a CCMC web server or a development server within the same private network. An API has not been formalized. The database server will be opened to outside networks once an API based on accepted web standards is defined and documented.

2.3 CAMEL Front End

The CAMEL front end focuses on the interactive visualization (described in the following section) that aligns observation data and model results at common times. Visualizations compare observations with data from one or more simulation runs or empirical model specifications and may include locations from multiple, similar instruments or satellites. Examples include magnetometers that can be divided into high-, middle-, and low-latitude locations, or spacecraft of a series such as DMSP, GOES, or LANL that are located in different local time sectors or at different distances from Earth. Users can also select the skill scores that they would like to calculate based on their selected data sets. The front end also provides options to filter out data gaps and outliers where applicable, and it offers a choice of interpolation method (nearest neighbor or linear interpolation) to use for model-data comparison and skill score calculation. In this paper we present the advances in visualization and skill score calculations (with support from CMR, item 1 in section 2.1).

3 Data-Model Comparisons and Event Assessments: Visualizations and Skill Scores

3.1 Old Online Time Series Visualization and Skill Score Calculation Tool

An online visualization and analysis tool has been available at the CCMC website via a run_metrics_vis.cgi interface since 2008, referred to later in this paper as “Metrics-Vis.” The tool is available on the CCMC website (https://ccmc.gsfc.nasa.gov). On the “Metrics and Validation” page the user can select one of several challenges, including the “GEM Challenge,” “CEDAR ETI Challenge,” or “GEM-CEDAR Challenge.” On each of those challenge pages, a “Time series plotting Tool” link directs the user to a table of visualization links utilizing the tool (e.g., “Time series plotting tool (ionosphere/thermosphere)” points to https://ccmc.gsfc.nasa.gov/challenges/GEM-CEDAR/plotting_tool_ion.php and “Time series plotting tool (magnetosphere)” points to https://ccmc.gsfc.nasa.gov/challenges/GEM-CEDAR/plotting_tool_mag.php from the GEM-CEDAR Challenge page). Each link in those tables provides a value to the tool for each of these four basic variables:
  1. Campaign: the name of the validation study (e.g., GEM2008, CETI2010, SWPC2010, and Space Radiation and Plasma Effects);
  2. Observatory: the name of the spacecraft, magnetometer, or other observatory/instrument used to obtain the observational data (e.g., GOES-10, RBSP-A, and YKC);
  3. Event: the selected time interval from a list of available predefined events included in the study;
  4. Metrics: the designation of an observational parameter and of applicable skill scores.

With these four variables the interface presents the user with a list of simulation runs that have been included as time series data into a file tree at the CCMC that can be visualized (see, e.g., Figures 1 and 2).
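For illustration, the four variables map directly onto the tool's CGI query string (parameter names taken from the example link in the Figure 1 caption; the helper function itself is hypothetical, not part of the CCMC code base):

```python
from urllib.parse import urlencode

BASE = "https://ccmc.gsfc.nasa.gov/cgi-bin/run_metrics_vis.cgi"

def metrics_vis_url(study, event, metrics, obs):
    """Build a Metrics-Vis link from the four basic variables:
    Campaign (study), Event, Metrics, and Observatory (obs)."""
    return BASE + "?" + urlencode(
        {"study": study, "event": event, "metrics": metrics, "obs": obs})

# The RBSP-A HOPE example shown in Figure 1:
url = metrics_vis_url("Plasma_Rad_Effects", 1, 2, "RBSPA-HOPE")
```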

Figure 1. Community Coordinated Modeling Center (CCMC) time series comparison tool interface (“Metrics-Vis”) applied to electron fluxes observed at Radiation Belt Storm Probes spacecraft (https://ccmc.gsfc.nasa.gov/cgi-bin/run_metrics_vis.cgi?study=Plasma_Rad_Effects&event=1&metrics=2&obs=RBSPA-HOPE).
Figure 2. Metrics-Vis plot of electron fluxes at the RBSP-A spacecraft from HOPE instrument observations and SWMF-RAMSCB and stand-alone RAMSCB and CIMI model runs. The table of skill scores was generated with the plot by Metrics-Vis. RBSP = Radiation Belt Storm Probes; HOPE = Helium Oxygen Proton Electron.

The user now can select the runs to include in the plot (traces shown in Figure 2, top) and in the table of skill scores (applicable scores displayed in Figure 2, bottom). Shell scripts that automate plotting and skill score calculations across multiple events and locations were developed by the lead author for each study, but they are not available through CCMC to outside users. With CAMEL, the aggregation of skill scores is now available within a single instance of the online interface, where multiple events and observation locations can be selected before calculating scores. With the addition of observation data and model outputs, CAMEL will in the future apply to more Space Physics domains and provide more validation techniques. Progress and results so far are described below.

3.2 New CAMEL Interactive Visualization and Skill Scoring

The web application user interface for CAMEL, available through https://ccmc.gsfc.nasa.gov/camel/ (Figure 3), provides applications with preselected values for Campaign and Metrics (the first and last of the four variables introduced in section 3.1). With these two variables selected, CAMEL retrieves the corresponding events, available simulation results, observatories, and parameter names from the CAMEL database. The first row of controls in the interface (Figure 3) presents the user with a choice of events, simulation model runs, and observation locations. The second row then features choices of parameter, skill score, interpolation method, and maximum data gap.

Figure 3. CAMEL Web Application User Interface for the radiation belt effects study. With the domain and validation study preselected, the interface offers choices of events, models, and observation locations in the first row. The second row of controls features parameter, skill score (drop-down activated to show choices), interpolation method, and time tolerance (maximum gap between nearest model output and observation time).

The interpolation method and gap filtering based on a user-selected time interval are new features not available in the earlier “Metrics-Vis” plotting interface at CCMC. Skill scores are listed in one table for each location and across all selected locations, and in another table for each time period (event) and across all selected events. Graphs are shown for each selected location (station, satellite) for one of the time periods or events. The data are graphed dynamically with panning, point picking (shown in Figure 4), and zooming capabilities.

Figure 4. Comprehensive Assessment of Models and Events Using Library Tools (CAMEL) plots of fluxes observed and modeled at the RBSP-A and RBSP-B locations. Point picking shows a value along one of the model tracks. RBSP = Radiation Belt Storm Probes; HOPE = Helium Oxygen Proton Electron.

As the tool chest of the CAMEL library grows, more granular functionality will be added to the user interface. These include rolling skill scores (based on interactive selection of start and stop times), scoring across multiple bases (observations from similar instruments, if available), smoothing options, and dynamic setting of threshold values to calculate skills based on hits and misses. Chaining of operations will allow for a flexible and powerful system that can be tailored to specific metrics. Each validation study uses a default analysis chain but CAMEL will allow for advanced tweaking of the skill score selection and of how some calculations are performed.
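A minimal sketch of how such chaining of operations might look (a hypothetical design, not the CAMEL implementation; the step functions are illustrative placeholders):

```python
def chain(*steps):
    """Compose analysis steps into one pipeline; each step maps a
    list of samples to a new list of samples."""
    def pipeline(samples):
        for step in steps:
            samples = step(samples)
        return samples
    return pipeline

# Hypothetical steps: drop NaN gaps, then clip negative values to zero.
drop_nans = lambda xs: [x for x in xs if x == x]   # NaN != NaN
clip_low  = lambda xs: [max(x, 0.0) for x in xs]

default_chain = chain(drop_nans, clip_low)
```

Each validation study would supply such a default chain, which an advanced user could then re-order or extend.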

3.3 Skill Scores

The skill scores that apply to most real model output values $y_i$ and observation data $x_i$ at times $t_i$ (model data interpolated to observation times) have been implemented in the old metrics validation system and are repeated in the new interactive visualization (with $\langle x \rangle$ denoting the arithmetic mean and $\sigma_x$ the standard deviation of a time series $x$):
  • Root-Mean-Square Error (RMSE): $\mathrm{RMSE} = \sqrt{\langle (y_i - x_i)^2 \rangle}$;
  • Correlation Coefficient (CC): $\mathrm{CC} = \langle (x_i - \langle x \rangle)(y_i - \langle y \rangle) \rangle / (\sigma_x \sigma_y)$;
  • Prediction Efficiency (PE): $\mathrm{PE} = 1 - \langle (y_i - x_i)^2 \rangle / \sigma_x^2$;
  • Mean Absolute Error (MAE): $\mathrm{MAE} = \langle |y_i - x_i| \rangle$;
  • Median Absolute Error (MdAE): $\mathrm{MdAE} = M(|y - x|)$, with $M(e)$ denoting the median value of a distribution $e$.

Skills to be implemented include the Mean Absolute Percentage Error (MAPE): $\mathrm{MAPE} = 100\,\langle |y_i - x_i| / |x_i| \rangle$ (see, e.g., Morley et al., 2018).
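The scores above translate directly into code; a minimal NumPy sketch (illustrative only, not the CAMEL library; `x` holds observations, `y` model values, and the population variance is used for $\sigma_x^2$):

```python
import numpy as np

def rmse(x, y):
    """Root-Mean-Square Error."""
    return float(np.sqrt(np.mean((y - x) ** 2)))

def cc(x, y):
    """Correlation Coefficient."""
    return float(np.corrcoef(x, y)[0, 1])

def pe(x, y):
    """Prediction Efficiency: 1 - <(y-x)^2> / var(x)."""
    return float(1.0 - np.mean((y - x) ** 2) / np.var(x))

def mae(x, y):
    """Mean Absolute Error."""
    return float(np.mean(np.abs(y - x)))

def mdae(x, y):
    """Median Absolute Error."""
    return float(np.median(np.abs(y - x)))

def mape(x, y):
    """Mean Absolute Percentage Error (to be implemented in CAMEL)."""
    return float(100.0 * np.mean(np.abs(y - x) / np.abs(x)))
```

A constant offset illustrates the difference between the scores: it leaves CC at 1 while degrading PE, RMSE, and MAE.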

3.4 Examples of SPASE Model Run Description

Having proper metadata for all data sets is an important first step for model-data comparison. Metadata contain searchable details of the model version and configuration used, which may include descriptions of component models in a coupled simulation, input parameters, and numerical settings used in the simulation. Within the CAMEL framework, all stored metadata are SPASE compliant. An example is provided in XML format in Figure 5 for the WSA model version 2.2.

Figure 5. Wang-Sheeley-Arge (WSA) v.2.2 model metadata in Space Physics Archive Search and Extract format.

Figure 6 shows the metadata for the predicted plus-1-day solar wind parameters generated by the near-real-time WSA version 2.2 model run. The metadata of the output identify its parent resource, in this case the WSA v.2.2 model. With this linkage information in the metadata model, it is easy for anyone to see what the output is and which model and model run generated it. This information is crucial for others to verify and reproduce the same results.

Figure 6. Wang-Sheeley-Arge (WSA) v.2.2 model output metadata for the predicted +1-day solar wind parameters.

4 Results

4.1 Ground Magnetic Perturbations at 13 Magnetometer Stations

Three global magnetosphere models and two statistical models were run to specify magnetic perturbations on the ground that are caused by changes in the magnetosphere and ionosphere of the Earth (Pulkkinen et al., 2011, 2013; Rastätter et al., 2014a). Solar wind data for six selected events lasting from 1 to 2 days each were selected, and observations at 13 magnetometer stations in “chains” (listed north to south) on the American West Coast (Yellowknife [YKC], Meanook [MEA], Newport [NEW], and Fresno [FRN]), East Coast (Iqaluit [IQA], Poste-de-la-Baleine [PBQ] or Sanikiluaq [SNK; for later events], Ottawa [OTT], and Fredericksburg [FRD]), and Europe (Hornsund [HRN], Abisko [ABK], Wingst [WNG], and Fürstenfeldbruck [FUR]) were compared with results obtained from the models. Magnetic perturbation data (ΔBNorth, ΔBEast, ΔBDown) are compared to the observations for stations poleward of the auroral zone (YKC, IQA, and HRN), near the auroral zone (high latitude; MEA, PBQ/SNK, and ABK), at subauroral latitudes (midlatitudes, ∼55°; NEW, OTT, and WNG), and at low latitudes (<50°; FRN, FRD, and FUR).

The CAMEL tool allows the user to select subsets of stations and events to rank models and to calculate averages of skill scores over the set of stations per event and over all events. Skill scores selected from a range of options are tabulated, and rankings are visualized. For reference, each model-data comparison plot for an event contains the traces (observation and model) for each magnetometer in a subpanel. In the following we present results from CAMEL applications for selected validation campaigns. The first is the GEM2008 ground magnetic perturbation study. Figure 7 shows a sample analysis of ΔBNorth performed with CAMEL showing the skill ranking of the different models for all stations. To refine the analysis, the user can select one of the available quantities (ΔBNorth, ΔBEast, and ΔBDown) and any subset of the 8 events and the 13 magnetometer stations. Figure 8 shows scores for auroral-latitude stations (ABK, PBQ, and MEA) for Event 1. Skill scores are calculated for each selected model run, event, and station, then tabulated for all selected stations and each station individually (across all selected events) and for all events and each event (for the set of selected stations). This allows the user to evaluate how each model performs for the different events and also, by selecting different subsets of the stations, where models perform better or worse spatially.

Figure 7. Cross-Correlation scores for ΔBNorth at all magnetometers for all events for the five simulation runs 9_SWMF, 4_OpenGGCM, 2_LFM-MIX, 2_WEIGEL, and 6_WEIMER. Skill scores are tabulated for each model by station (averaged over all selected events) and by event (averaged over all selected stations).
Figure 8. Prediction Efficiency scores for ΔBNorth at auroral-zone magnetometers (ABK, PBQ, and MEA) for Event 1 for the five simulation runs 9_SWMF, 4_OpenGGCM, 2_LFM-MIX, 2_WEIGEL, and 6_WEIMER. This mode of operation can generate scores by region for multiple events, similar to Figure 7.

The graphing capability is shown in Figure 9 for station IQA during Event 1. The CAMEL tool allows the user to show model and data comparison plots for one event at a time at multiple station locations (in separate plot panels, not shown).

Figure 9. Time series plots of ΔBNorth at Iqaluit (IQA) for Event 1 for the five simulation runs. A separate interactive graph is created for each station selected (two graphs are shown). The user can query numerical values by mouse-over and select a portion of the graph for detailed display (not shown).

In the near future we plan to implement bar graphs to visualize the skill scores listed in the tables. Later in the development of the tool we envision the option to evaluate models using two skill scores; the visualization would then render the models' performance in a two-dimensional (Taylor-type) plot.

Once we import time derivatives of the horizontal component of the magnetic perturbations (dΔBH/dt; Pulkkinen et al., 2013), we will add a user interface to select threshold values, with a list of the values used in the published study, to generate a contingency table of Hits, Misses, False Predictions, and Correct Negatives. We will then calculate skills such as POD, POFD, and HSS.
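The contingency-table scores follow standard definitions; a minimal sketch (illustrative only; a threshold-exceedance convention is assumed here, with H = hits, M = misses, F = false predictions, N = correct negatives):

```python
def contingency(obs, model, threshold):
    """Count Hits, Misses, False predictions, and correct Negatives for
    a yes/no event defined as a value at or above the threshold."""
    h = m = f = n = 0
    for o, y in zip(obs, model):
        o_yes, y_yes = o >= threshold, y >= threshold
        if o_yes and y_yes:
            h += 1          # hit: observed and predicted
        elif o_yes:
            m += 1          # miss: observed but not predicted
        elif y_yes:
            f += 1          # false prediction
        else:
            n += 1          # correct negative
    return h, m, f, n

def pod(h, m, f, n):
    return h / (h + m)      # Probability of Detection

def pofd(h, m, f, n):
    return f / (f + n)      # Probability of False Detection

def hss(h, m, f, n):
    # Heidke Skill Score from the 2x2 contingency table
    return 2.0 * (h * n - m * f) / ((h + m) * (m + n) + (h + f) * (f + n))
```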

4.2 WSA Model and In Situ Observation of Density and Velocity at L1 in the Heliosphere

The WSA (Arge & Pizzo, 2000) near-real-time “simulation” is a continuous series of coronal and heliospheric solutions constructed by the WSA model based on magnetograms of the Sun that are updated every 6 hours. Each WSA calculation in that series provides forecasts, with lead times between 1 and 7 days (in daily increments), of solar wind speed and density at the first Lagrange point (L1) near Earth (about 240 Earth radii, or about 1.5 × 10^6 km, sunward of Earth on the Sun-Earth line). Metadata stored for such a run can be found in the CMR. Solar wind monitoring satellites such as IMP8, WIND (Lepping et al., 1995; Lin et al., 1995), ACE, and DSCOVR (McComas et al., 1998; Smith et al., 1998) provide measurements that are aggregated into the OMNI2 database. Model outputs are compared to OMNI2 hourly averages of velocity and density.

The CAMEL visualization tool interpolates model outputs to the times of the observation data points. The user can select between linear interpolation and nearest neighbor interpolation. Gaps in the model data are handled by a user-selected maximum time between model outputs that may be bridged by interpolation. For each observation time, the nearest neighbor among the model output times is sought; a matching pair is created only if the difference between the observation time and the nearest model output time is at or below the user-selected time tolerance. Model outputs are then interpolated using the selected method (nearest neighbor or linear interpolation) to create the modeled value in the matched pair, and all created pairs are used in the skill score calculations. The percentage of available samples is the ratio between the number of matched pairs and the number of observation times for a given time interval (event). In the near future we plan to add the capability of listing the percentage of matched samples used in each skill score calculation. Time intervals indicating Carrington rotations can be presented for the users to use as events.
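The matching procedure just described can be sketched as follows (illustrative Python, not the CAMEL back end; observation and model times are assumed to be in the same units as the tolerance):

```python
import numpy as np

def match_pairs(obs_t, obs_v, mod_t, mod_v, tol, method="nearest"):
    """Pair each observation with a model value when the nearest model
    output lies within `tol` of the observation time; drop it otherwise."""
    obs_t = np.asarray(obs_t, float); obs_v = np.asarray(obs_v, float)
    mod_t = np.asarray(mod_t, float); mod_v = np.asarray(mod_v, float)
    order = np.argsort(mod_t)                  # np.interp needs sorted times
    mod_t, mod_v = mod_t[order], mod_v[order]
    pairs = []
    for t, x in zip(obs_t, obs_v):
        i = int(np.argmin(np.abs(mod_t - t)))  # nearest model output
        if abs(mod_t[i] - t) > tol:
            continue                           # no model output close enough
        y = mod_v[i] if method == "nearest" else np.interp(t, mod_t, mod_v)
        pairs.append((float(t), float(x), float(y)))
    return pairs

# Percentage of available samples = 100 * len(pairs) / len(obs_t)
```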

Figure 10 shows the 1-day and 4-day predictions from WSA compared to the 1-hr OMNI data for January 2017. The skill scores for this month indicate moderate agreement between observations and models (see Table 1).

Figure 10. Wang-Sheeley-Arge 1-day (orange: original data; red: linearly interpolated) and 4-day predictions (blue, purple) and OMNI solar wind speed (green) for January 2017.
Table 1. Monthly Skill Scores for 2016 (July–December) and 2017 (January–June) From the WSA Model's Solar Wind Speed Predictions With 1 and 4 Days of Lead Time
Run Skill July Aug. Sep. Oct. Nov. Dec. Jan. Feb. Mar. Apr. May Junea Year
1 dayb CC 0.293 0.260 0.691 0.700 0.622 0.490 0.511 0.707 0.406 0.282 0.737 0.557 0.519
4 day CC 0.117 0.125 0.744 0.778 0.523 0.482 0.646 0.740 0.427 0.059 0.648 0.642 0.498
1 day PE −0.159 −0.211 0.457 0.428 0.230 0.171 0.077 0.417 0.025 −0.193 0.539 0.117 0.191
4 day PE 0.511 −0.418 0.507 0.575 −0.0903 0.123 0.319 0.445 0.116 −0.805 0.393 0.321 0.133
  • Note. CC = Correlation Coefficient; PE = Prediction Efficiency; WSA = Wang-Sheeley-Arge.
  • a Comparisons with only 30% model coverage.
  • b Runs labeled by number of days of lead time rather than the Space Physics Archive Search and Extract ID.
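The two scores reported in Table 1 can be computed directly from the matched model-observation pairs. Below is a minimal sketch of the standard definitions of Prediction Efficiency and the Pearson Correlation Coefficient; the function names are illustrative, not part of CAMEL:

```python
import numpy as np

def prediction_efficiency(obs, model):
    """PE = 1 - sum((model - obs)^2) / sum((obs - mean(obs))^2).
    PE = 1 is a perfect prediction; PE <= 0 means the model does no
    better than always predicting the observed mean."""
    obs, model = np.asarray(obs, float), np.asarray(model, float)
    return 1.0 - np.sum((model - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def correlation_coefficient(obs, model):
    """Pearson linear correlation coefficient between the two series."""
    return np.corrcoef(np.asarray(obs, float), np.asarray(model, float))[0, 1]
```

Note that a model offset by a constant bias can still have CC = 1 while its PE is well below 1, which is why Table 1 reports both scores.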

4.3 Radiation Belt Dynamics

As part of an ongoing assessment of model skill in specifying electron flux in the inner magnetosphere, several ring current and radiation belt models are being run for selected time periods (including 17 March 2013 and 31 May to 1 June 2013). Modeled energy distributions of electron fluxes are interpolated to energies measured by the HOPE instrument on the two RBSP (or Van Allen Probes) satellites (RBSP-A and RBSP-B) that orbit between 2- and 7-RE distance around the Earth. HOPE energies range from a few electron volts to 40 keV, and electron fluxes at energies above 10 keV are correlated with surface charging events in low-Earth-orbiting satellites (Yu et al., 2019). The online visualization and analysis tool on the CCMC website plots electron fluxes on a logarithmic scale and applies differences of logarithms of flux values to calculate skill scores. This ensures that skills are calculated based on the ratio of model and observation values instead of their linear differences, which are meaningless in an environment characterized by differences spanning orders of magnitude (up to and beyond 10^6; Friedel et al., 2002; Selesnick & Blake, 1997). Morley et al. (2018) introduced the Median Symmetric Accuracy and the SSPB, scores based on log(flux) values that improve on the Mean Absolute Percentage Error.
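Computing a skill score on differences of logarithms amounts to replacing the fluxes by their log10 values before applying the score. A minimal sketch for Prediction Efficiency on log fluxes follows; the flux floor guarding against the logarithm of zero is an assumption of this sketch, not a detail from the paper:

```python
import numpy as np

def log_flux_pe(obs_flux, model_flux, floor=1.0):
    """Prediction Efficiency computed on log10(flux), so that errors
    measure the ratio of model to observation rather than their linear
    difference. Fluxes below `floor` are clipped before taking the log."""
    o = np.log10(np.maximum(np.asarray(obs_flux, float), floor))
    m = np.log10(np.maximum(np.asarray(model_flux, float), floor))
    return 1.0 - np.sum((m - o) ** 2) / np.sum((o - o.mean()) ** 2)
```

In this form, a model that is uniformly a factor of 10 too high incurs the same penalty at 10^2 as at 10^6 flux units, which is the desired behavior over the radiation belt's dynamic range.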

The skill scores in Table 2 were generated with the old run_metrics_vis.cgi interface on the CCMC website (with study=Plasma_Rad_Effects, obs=RBSPA-HOPE or RBSPB-HOPE, event=1 or 2, and metrics=2 to compare fluxes integrated above 10 keV [metrics=1 compares flux integrated over all HOPE energies]; e.g., https://ccmc.gsfc.nasa.gov/cgi-bin/run_metrics_vis.cgi?study=Plasma_Rad_Effects&event=2&metrics=2&obs=RBSPB-HOPE). PE and CC scores are available in CAMEL, but SSPB scores have not yet been added.

Table 2. Skill Scores for the Two Events (16 March 2013 00:00 to 20 March 2013 00:00 UT and 31 May 2013 00:00 UT to 2 June 2013 00:00 UT) in the Modeling of Integrated Fluxes Measured by the RBSP (VAP) HOPE Instrument for Energies Above 10 keV
Runa SPCR CC (Event 1b) CC (Event 2c) PE (Event 1) PE (Event 2) SSPB (Event 1) SSPB (Event 2)
1_CIMI RBSP-A 0.683 0.252 −1.133 −13.896 90.4 38.7
1_RAMSCB RBSP-A 0.530 −0.393 −3.6
1_SWMF-RAMSCB RBSP-A 0.292 0.395 −2.303 −1.038 −162.8 −121.6
2_SWMF-RAMSCB RBSP-A 0.212 −2.946 −179.3
1_CIMI RBSP-B 0.699 −2.201 145.7
1_RAMSCB RBSP-B 0.285 −11.242 −156.9
1_SWMF-RAMSCB RBSP-B 0.453 0.415 −2.470 −0.685 −137.5 −32.6
2_SWMF-RAMSCB RBSP-B 0.326 −3.110 −156.8
  • Note. Skills were calculated using logarithmic fluxes when model and observation values were ≥10^6 /s/sr/cm^2. CC = Correlation Coefficient; PE = Prediction Efficiency; RBSP = Radiation Belt Storm Probes; VAP = Van Allen Probes; HOPE = Helium Oxygen Proton Electron.
  • a Runs as labeled in the old visualization.
  • b 17 March 2013 00:00 to 18 March 2013 00:00 UT.
  • c 31 May 2013 00:00 to 2 June 2013 00:00 UT.

Figure 11 shows the electron fluxes over all energies observed during the day of 17 March 2013 at RBSP-A (purple) and four different model runs (1_SWMF-RAMSCB [green], 2_SWMF-RAMSCB [the same model with slight adjustments, red], 1_CIMI [blue], and 1_RAMSCB [stand-alone RAMSCB, orange]). We implemented in Metrics-Vis the skill score SSPB = 100 sgn(M(ln Q)) (exp(|M(ln Q)|) − 1), where Q is the array of ratios of modeled to observed values at the individual time steps and M(z) denotes the median value of the argument z, as described in Morley et al. (2018). This skill score penalizes overestimation by a given factor with the same weight as underestimation by that factor. It is insensitive to outlying values because it uses the median, while providing an intuitive percentage value. SSPB is well suited for observations whose values vary over a large dynamic range, as found in the Earth's radiation belt. In CAMEL we plan to offer SSPB and the related Median Symmetric Accuracy, MSA = 100 (exp(M(|ln Q|)) − 1), both introduced by Morley et al. (2018), for studies of radiation belt and ring current fluxes.
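Under the definitions from Morley et al. (2018), SSPB and MSA can be sketched as follows; this is illustrative code, not the Metrics-Vis implementation:

```python
import numpy as np

def sspb(obs, model):
    """Symmetric Signed Percentage Bias:
    100 * sgn(M(ln Q)) * (exp(|M(ln Q)|) - 1), with Q = model/obs and
    M = median. Over- and underestimation by the same factor yield the
    same magnitude, and the median makes the score robust to outliers."""
    log_q = np.log(np.asarray(model, float) / np.asarray(obs, float))
    m = np.median(log_q)
    return 100.0 * np.sign(m) * (np.exp(np.abs(m)) - 1.0)

def msa(obs, model):
    """Median Symmetric Accuracy: 100 * (exp(M(|ln Q|)) - 1)."""
    abs_log_q = np.abs(np.log(np.asarray(model, float) / np.asarray(obs, float)))
    return 100.0 * (np.exp(np.median(abs_log_q)) - 1.0)
```

A model that is consistently a factor of 2 high scores SSPB = +100%, while a model consistently a factor of 2 low scores SSPB = −100%, illustrating the symmetric treatment of over- and underestimation.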

RBSP-A HOPE integrated electron flux over all energies (purple) and model results (other colors) for the day of 17 March 2013 (part of Event 1). RBSP = Radiation Belt Storm Probes; HOPE = Helium Oxygen Proton Electron.

5 Summary and Outlook

We demonstrated essential new capabilities of the CAMEL framework to support comprehensive model-data comparison studies. CAMEL can calculate skill scores across multiple events (predefined time periods) and across multiple station or spacecraft locations, generating aggregate skill scores at each location for all selected events and at all locations for each event. Plots of data and model results can be created for multiple locations, if applicable, for one of the selected time periods. In addition, CAMEL lets the user select the interpolation method and the maximum acceptable time gap between an observation and the nearest model output when performing interpolations. These capabilities enable validation studies beyond the reach of the previously existing Metrics-Vis interface on the CCMC website, which was restricted to a single event and a single observation location.

The CAMEL framework for comparing observation data with model results is still undergoing rapid development. We plan to provide a single CAMEL interface that features all available validation studies, organized by physical domain and type of parameter, covering observations and model outputs in the Solar Corona, Heliosphere, Magnetosphere, Ionosphere/Thermosphere, and on the Earth's surface. We are working to add more skill scores and interpolation methods and to provide a user interface for applying minimum and maximum values when making plots and calculating skills. We envision extending the CAMEL framework to multidimensional data-model comparisons by employing pattern and feature extraction tools. The capabilities of the Model Evaluation Tools (MET) at NCAR, used for National Oceanic and Atmospheric Administration weather modeling, are being investigated for possible inclusion. Later, the pattern recognition software within MET that weather forecasters use to track two-dimensional features such as clouds or storm fronts in time will be used for model-model comparisons, or in cases where imaging is available on a regular basis, such as CME modeling in the heliosphere, where comparisons with coronagraph images or heliospheric J-maps (space-time plots tracking features moving along a certain line in the plane of view) can be performed. Pattern recognition and tracking will also be useful in comparisons of ionosphere and auroral observations and modeling.

The interactive visualization and skill scoring system will include categorical comparisons such as those done for the ground magnetic perturbation study. There, levels of dB/dt on a 1-min time scale from first-principles models of the global magnetosphere-ionosphere system and from statistical models of magnetic perturbation were compared to observations at selected ground magnetometers. Using predefined thresholds of dB/dt, a contingency table of Hits, Misses, False Positives, and Correct No-forecasts was assembled, and skills such as Probability of Detection (POD), Probability of False Detection (POFD), and Heidke Skill Score (HSS) were calculated to rank the models. Together with event definitions, the tool will let the user control threshold levels and the subdivision of time intervals into sampling windows, as was done in the previous work.
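The categorical scoring described above can be sketched as follows. The contingency-table bookkeeping and score definitions are standard; the function name and its inputs (boolean threshold-crossing flags per sampling window) are illustrative assumptions:

```python
def categorical_skills(obs_exceeds, model_exceeds):
    """Build a 2x2 contingency table from per-window threshold crossings
    and return POD, POFD, and the Heidke Skill Score (HSS)."""
    hits = misses = false_pos = correct_neg = 0
    for o, m in zip(obs_exceeds, model_exceeds):
        if o and m:
            hits += 1            # observed and forecast
        elif o:
            misses += 1          # observed but not forecast
        elif m:
            false_pos += 1       # forecast but not observed
        else:
            correct_neg += 1     # correctly not forecast
    n = hits + misses + false_pos + correct_neg
    pod = hits / (hits + misses) if hits + misses else float("nan")
    pofd = (false_pos / (false_pos + correct_neg)
            if false_pos + correct_neg else float("nan"))
    # HSS: improvement of correct forecasts over the chance expectation.
    expected = ((hits + misses) * (hits + false_pos)
                + (misses + correct_neg) * (false_pos + correct_neg)) / n
    hss = ((hits + correct_neg - expected) / (n - expected)
           if n != expected else float("nan"))
    return pod, pofd, hss
```

HSS = 1 for a perfect forecast and 0 for a forecast no better than chance, which is why it is used here to rank models against each other.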

The coordination with VxOs via SPASE is out of scope for the CAMEL project. In the future, a VxO interface to CCMC simulation runs will be mediated by the planned RoR-NextGen system, using SPASE descriptions of model runs and output parameters generated by each of the models. CAMEL's support of SPASE descriptors will eventually enable direct pairing of observations with model results for automatic analyses.


This work is performed as part of research at the NASA Goddard Space Flight Center's Community Coordinated Modeling Center and is funded in part by the National Science Foundation (NSF) Space Weather Program. Data used in the model-data comparisons are the 1-hr OMNI2 data set for the solar wind (https://omniweb.gsfc.nasa.gov/) and ECT-HOPE data in the radiation belt from the Van Allen Probes (RBSP, https://rbsp-ect.newmexicoconsortium.org/data_pub/).