A Deep Learning Model for the Thermospheric Nitric Oxide Emission

Nitric oxide (NO) infrared radiation is an essential cooling source for the thermosphere, especially during and after geomagnetic storms. An accurate representation of the three-dimensional (3-D) morphology of NO emission in models is critical for predicting the state of the thermosphere. Recently, deep-learning neural networks have been widely used in space weather prediction and forecasting. The 3-D images of NO emission from the Sounding of the Atmosphere using Broadband Emission Radiometry (SABER) instrument onboard the Thermosphere Ionosphere Mesosphere Energetics and Dynamics (TIMED) satellite contain a large amount of missing (unobserved) data, so a context loss function is applied to extract features from the incomplete SABER NO emission images. A 3-D NO emission model (referred to as NOE3D), based on a convolutional neural network with a context loss function, is developed to estimate the 3-D distribution of NO emission. NOE3D can effectively extract features from incomplete SABER 3-D images. Additionally, NOE3D performs well not only on the training datasets but also on the test datasets. The climatological variations of NO emission associated with solar activity are well reproduced by NOE3D. The comparisons suggest that NOE3D predicts the NO emission better than the Thermosphere-Ionosphere Electrodynamics General Circulation Model. More importantly, NOE3D is capable of providing the variations of NO emission during extremely disturbed times.

With the advancement of computing hardware and open-source machine learning frameworks, deep learning has become increasingly popular and has been successfully applied in space physics. Examples of deep learning applications in space physics research include geomagnetic index forecasting (Gruet et al., 2018; Tan et al., 2018), electron density modeling in the inner magnetosphere (Chu et al., 2017), and solar UV/EUV image generation (Park et al., 2019). Deep learning (LeCun et al., 2015) can automatically extract valuable features from massive data and provides an effective way to build atmospheric parameter models from a large number of observations. The global three-dimensional (3-D) image of NO emission constructed from SABER observations contains a large amount of missing (unobserved) data, since there is only one orbital pass every ∼1.6 h. Such an incomplete 3-D image makes modeling significantly more difficult. Deep-learning-based image completion has been successfully applied in space physics (Chen et al., 2019; Pan et al., 2020). However, no attempt has yet been made to build a 3-D prediction model from incomplete 3-D images.
The NO emission 3-D images consist of the SABER observations over a selected time interval. Within a time interval of a few hours, these images contain a large number of unobserved regions, which are referred to as missing data. The purpose of this study is to develop a 3-D NO emission model from SABER images with a large amount of missing data. We used a 3-D convolutional neural network (CNN) to build the model. The CNN is a deep learning technique for image recognition and generation. The kernel in a CNN is a matrix that scans the whole image and extracts translation- and rotation-invariant features from it. In order to extract features from SABER 3-D images with massive unobserved regions, a context loss function was used to measure the similarity between the observations and the CNN outputs. The 3-D NO emission model based on the 3-D CNN with a context loss function is referred to as NOE3D hereafter. The accuracy of NOE3D is evaluated using correlation and error analysis. NOE3D outputs during the April 5, 2010 geomagnetic storm are used to assess the capability of predicting the storm-time NO emission, when the NO emission plays an important role in the thermospheric energy budget. Additionally, the results from NOE3D are compared with the Thermosphere-Ionosphere Electrodynamics General Circulation Model (TIEGCM) to determine whether NOE3D performs better than the theoretical model.

SABER Observations
The NO emission data at 5.3 μm are obtained from the SABER instrument onboard the Thermosphere-Ionosphere-Mesosphere Energetics and Dynamics (TIMED) satellite. It should be noted that the NO emissions from SABER, TIEGCM, and NOE3D are all at 5.3 μm in this study. The TIMED satellite is in a near-circular orbit at 625 km with an inclination of 74° and has an orbital precession of 60 days. The SABER observations cover from 83°S to 55°N (or from 55°S to 83°N) depending on the yaw cycle. The SABER instrument has a spectral range from 1.27 μm to 17 μm, covering the 5.3 μm NO radiation spectrum. The SABER data from 2002 to 2009 are used as the training set, and the test set is from 2010 to 2015. The 3-D image of NO emission has a 15 × 36 × 36 grid, representing height (110-250 km) with a resolution of 10 km, latitude with a resolution of 5°, and longitude with a resolution of 10°, respectively. The grid value in the 3-D observation image is the average of the surrounding SABER observations. The time resolution of the images is 1.6 h. Figure 1a shows a typical 3-D observation image in which colored points represent the observed data and transparent points represent unobserved regions. Although an image with a longer time window would provide more extensive observational coverage, the observations at different longitudes would have an almost fixed local time because of the slow orbital precession, which would introduce incorrect longitudinal features into the neural network model.
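As an illustration of how such a sparse image and its observation mask might be assembled, the following Python sketch bins scattered samples onto the 15 × 36 × 36 grid described above. The function name `grid_observations`, the bin edges, the longitude convention (-180° to 180°), and the use of a simple per-cell mean in place of "the average of surrounding SABER observations" are all assumptions made for illustration, not the authors' procedure.

```python
import numpy as np

# Grid from the text: 15 heights (110-250 km, 10 km step),
# 36 latitude bins (5 degrees) and 36 longitude bins (10 degrees).
N_H, N_LAT, N_LON = 15, 36, 36
LAT_EDGES = np.linspace(-90.0, 90.0, N_LAT + 1)    # assumed bin edges
LON_EDGES = np.linspace(-180.0, 180.0, N_LON + 1)  # assumed longitude convention


def grid_observations(height_km, lat_deg, lon_deg, emission):
    """Average scattered samples into a 3-D image and return (image, mask).

    mask is 1.0 where at least one sample fell into the cell, else 0.0.
    """
    height_km = np.asarray(height_km, dtype=float)
    lat_deg = np.asarray(lat_deg, dtype=float)
    lon_deg = np.asarray(lon_deg, dtype=float)
    emission = np.asarray(emission, dtype=float)

    lon_wrapped = (lon_deg + 180.0) % 360.0 - 180.0   # map 0-360 onto -180..180
    ih = np.rint((height_km - 110.0) / 10.0).astype(int)  # nearest height level
    il = np.digitize(lat_deg, LAT_EDGES) - 1
    io = np.digitize(lon_wrapped, LON_EDGES) - 1

    # Drop samples that fall outside the grid.
    valid = ((ih >= 0) & (ih < N_H) & (il >= 0) & (il < N_LAT)
             & (io >= 0) & (io < N_LON))
    idx = (ih[valid], il[valid], io[valid])

    total = np.zeros((N_H, N_LAT, N_LON))
    count = np.zeros((N_H, N_LAT, N_LON))
    np.add.at(total, idx, emission[valid])   # accumulate repeated indices
    np.add.at(count, idx, 1.0)

    observed = count > 0
    image = np.zeros_like(total)
    image[observed] = total[observed] / count[observed]
    return image, observed.astype(float)
```

With one orbit per image, most cells receive no samples, so the returned mask is mostly zero; the mask is what a masked loss would later use to ignore unobserved cells.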

Loss Function
In this study, we use the 3-D CNN, a typical deep learning method, to develop a 3-D NO emission model. The 3-D CNN scans the whole image with a kernel, which is a 3-D matrix, and extracts translation- and rotation-invariant features from the 3-D images. The variation of NO emission lags the Joule heating variation by ∼10 h, and Joule heating correlates well with the geomagnetic activity index Kp. Meanwhile, the NO emission is best correlated with F10.7 when it lags the solar flux by 1 day. Thus, NO emission depends on the history of the Kp and F10.7 indices, and a 3-day period covers the time over which NO emission is significantly correlated with Kp and F10.7. The inputs of the neural network are the 3-day Kp with a 3-h resolution, the 3-day F10.7 with a resolution of 1 day, the universal time (UT), and the day of year (doy).

Although the training set has around 40,000 3-D images, around 95% of the area in each image is unobserved. This huge data gap makes the CNN training extremely difficult. In order to extract features from sparse 3-D images, we used a context loss function (Yeh et al., 2017), which measures the context similarity between the outputs of the neural network and the valid observations. The context root-mean-square error (RMSE) loss function in this study is defined as

$$\mathrm{Loss} = \sqrt{\frac{1}{m} \sum \left[ M \odot \left( y - \hat{y} \right) \right]^{2}}$$

where y is the 3-D observation image, ŷ is the output of the CNN, m is the number of observed grid points in the 3-D image, M is the binary mask of observed grid points with the same size as the image, ⊙ is the element-wise multiplication, and the sum runs over all grid points. It can be seen that the value of the context loss function depends only on the valid data of the observed images. The weights of the neural network are updated from the loss function through the backpropagation algorithm in the form of

$$\omega_i \leftarrow \omega_i - \eta \frac{\partial \mathrm{Loss}}{\partial \omega_i}$$

where $\omega_i$ is a weight of the neural network and $\eta$ is the learning rate.
Thus, the context loss function can help the neural networks extract features from incomplete images. In this study, the 3-D NO emission model based on 3-D CNN with the context loss function is referred to as NOE3D. A typical output of NOE3D is shown in Figure 1b.
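The masked loss can be sketched in a few lines of NumPy. This is a minimal illustration that assumes the loss is the RMSE taken over observed grid points only, using the quantities defined in the text (observation image y, CNN output ŷ, binary mask M, and number of observed points m). The function names are hypothetical, and the analytic gradient is given only to show that unobserved points contribute nothing to the weight update.

```python
import numpy as np


def context_rmse(y_obs, y_pred, mask):
    """RMSE between observation and prediction over points where mask == 1."""
    m = mask.sum()
    if m == 0:
        raise ValueError("image contains no observed grid points")
    diff = mask * (y_obs - y_pred)          # unobserved points are zeroed out
    return np.sqrt((diff ** 2).sum() / m)


def context_rmse_grad(y_obs, y_pred, mask):
    """Gradient of context_rmse with respect to y_pred.

    For a binary mask, dLoss/dy_pred = -mask * (y_obs - y_pred) / (m * Loss),
    which is exactly zero wherever the image is unobserved.
    """
    loss = context_rmse(y_obs, y_pred, mask)
    if loss == 0:
        return np.zeros_like(y_pred)
    return -mask * (y_obs - y_pred) / (mask.sum() * loss)


# Toy example: only the top row is "observed".
y = np.array([[1.0, 2.0], [3.0, 4.0]])
pred = np.array([[1.0, 0.0], [0.0, 4.0]])
M = np.array([[1.0, 1.0], [0.0, 0.0]])
loss = context_rmse(y, pred, M)
grad = context_rmse_grad(y, pred, M)
```

Because the gradient vanishes at unobserved points, the network output there is constrained only indirectly, through weights shared with locations that were observed in other images.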

Model Performance
To evaluate the performance of NOE3D, we compared the NO flux at 5.3 μm from NOE3D, TIEGCM, and SABER. Figure 2 shows the comparison of NOE3D and TIEGCM results with SABER observations along TIMED trajectories from the training data set (years 2002-2009) and the test data set (years 2010-2015). In this study, the NO flux is the integral of the NO emission rate at 5.3 μm from 110 to 250 km. The NO flux from the TIEGCM has a systematic error with respect to the SABER observations. The slopes of the linear regression of log10(NO flux) between the TIEGCM and SABER are less than 0.68. For low NO fluxes in particular, the TIEGCM NO fluxes are much greater than the SABER observations. In contrast, the regression slopes of log10(NO flux) between NOE3D and SABER are larger than 0.91, and the NO fluxes from NOE3D are consistent with the observations. The correlation coefficient is 0.931 for the training data set and 0.866 for the test data set, which indicates that NOE3D reconstructs 86.7% and 62.4% of the variations in the training and test datasets, respectively. Thus, NOE3D performs better than the TIEGCM in predicting the observed NO emission at 5.3 μm.

Prior to building the NO emission model with the CNN, the empirical orthogonal function (EOF) method was used in previous studies to reconstruct the global map of NO flux from SABER 5.3 μm emission data (Li et al., 2019). Flynn et al. (2018) presented an EOF method with spherical harmonics to extract features from the SABER observations. The EOFs can extract the most significant periodic variability in a fixed coordinate frame, such as geomagnetic latitude and magnetic local time, as demonstrated by Flynn et al. (2018). These EOFs are correlated with geomagnetic activity, solar activity, and seasonal variations in the thermosphere, and show significant physical structures. In contrast, the CNN functions more like a black box but is more flexible than the EOFs.
In this study, the CNN not only extracts the longitude-latitude features but also captures the variability of NO emission with local time and solar zenith angle by combining UT and doy from the input parameters.
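The flux and regression diagnostics described above can be sketched as follows. This is a hedged illustration: it assumes a trapezoidal integration over the 15 grid heights, places SABER on the x-axis of the regression, and computes the correlation on log10 values, none of which is specified in the text. The function names are hypothetical.

```python
import numpy as np

HEIGHTS_M = np.arange(110, 251, 10) * 1.0e3   # grid heights in metres


def no_flux(emission_rate):
    """Integrate the emission rate (W m^-3) over height (axis 0) into a flux (W m^-2)."""
    e = np.asarray(emission_rate, dtype=float)
    dz = np.diff(HEIGHTS_M).reshape((-1,) + (1,) * (e.ndim - 1))
    # Trapezoidal rule written out explicitly along the height axis.
    return (0.5 * (e[1:] + e[:-1]) * dz).sum(axis=0)


def log_flux_stats(model_flux, saber_flux):
    """Regression slope, correlation r, and r^2 of log10(model) against log10(SABER)."""
    x = np.log10(np.asarray(saber_flux, dtype=float))
    y = np.log10(np.asarray(model_flux, dtype=float))
    slope, _intercept = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]
    return slope, r, r ** 2
```

The squared correlation coefficient is the fraction of variance reproduced by a linear fit; for example, a coefficient of 0.931 corresponds to 0.931² ≈ 0.867.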
CHEN ET AL.

NOE3D can predict 3-D NO emission rates at 5.3 μm in a height range from 110 to 250 km. The TIEGCM and SABER NO emissions at 5.3 μm are interpolated to the same heights as the NOE3D grid. The correlation coefficients at various altitudes are presented in Figure 3. NOE3D performs very well from 180 to 210 km, where the coefficients for both the training and test datasets are greater than 0.88. However, the correlation coefficients become lower below 120 km and above 250 km. The NO emission is derived from the radiance profile measured by the SABER instrument through Abel inversion (Mlynczak et al., 2005). The measurement error, the rotational temperature uncertainties, and the improper filter function remain sources of unknown noise in the NO emission (Mlynczak, Hunt, Thomas Marshall, et al., 2010). The effect of this noise becomes significant where the NO emission declines greatly, and may contribute to the low correlation between SABER measurements and model results below 120 km and above 250 km. Compared with the TIEGCM results, NOE3D performs better at all altitudes. This suggests that the deep neural network is a powerful technique for reconstructing the global NO emission rate from limited SABER NO observations.

The 14-year SABER observations (2002-2015) cover the declining phase of solar cycle (SC) 23 and the ascending phase of SC24, so the performance of NOE3D can be examined under different levels of solar activity. Figure 4 shows the time series of daily average NO flux from SABER, NOE3D, and TIEGCM. The relative deviation is defined as (model − SABER)/SABER. The NO flux closely follows solar activity (Mlynczak et al., 2014, 2015). NOE3D generally reproduced the observed NO flux variations under different solar activity conditions.
NOE3D also captured the high-frequency "spikes", which are attributed to geomagnetic activity (Mlynczak et al., 2014), such as the 2003 Halloween storms. There are large differences between TIEGCM and SABER: the average relative deviation of TIEGCM is up to 102.8%. Although TIEGCM captures the trend of the NO flux variations that result from solar or geomagnetic activity, the NO flux in TIEGCM is generally overestimated.
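For concreteness, the relative deviation used in this comparison can be computed as in the short Python sketch below. It assumes the definition (model − SABER)/SABER applied to daily average fluxes. Whether the quoted average is taken over signed or absolute deviations is not stated in the text, so the sketch returns both. The function names are hypothetical.

```python
import numpy as np


def relative_deviation(model_flux, saber_flux):
    """Element-wise (model - SABER) / SABER; NaN where either value is unusable."""
    model = np.asarray(model_flux, dtype=float)
    saber = np.asarray(saber_flux, dtype=float)
    valid = np.isfinite(model) & np.isfinite(saber) & (saber > 0)
    rel = np.full(model.shape, np.nan)
    rel[valid] = (model[valid] - saber[valid]) / saber[valid]
    return rel


def summarize(rel):
    """Return (mean signed deviation, mean absolute deviation), ignoring NaN."""
    return np.nanmean(rel), np.nanmean(np.abs(rel))
```

A mean relative deviation of about 1 (100%) means that the model flux is, on average, roughly twice the observed flux.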
NOE3D not only performs well on the training datasets, but also predicts the NO emission features in the test datasets. In the following, we focus on the results for the test data set to estimate the prediction capability of NOE3D. The error distributions of the NO emission rate with altitude from 2010 to 2015 are given in Figure 5. The error is defined as $\log_{10}(\mathrm{model}) - \log_{10}(\mathrm{SABER})$. As shown in Figure 5, the mean error of NOE3D is close to zero at all altitudes and the mean RMSE is 0.19, which implies that NOE3D can accurately reconstruct the height distribution of the NO emission rate. The mean RMSE from the TIEGCM simulations reaches 0.35, which indicates that the TIEGCM NO emission rates have large discrepancies with respect to the SABER observations. From 110 to 160 km, the TIEGCM NO emission rates are larger than the SABER observations, but they are lower at higher altitudes. These discrepancies at the NO emission peak height (∼130 km) are consistent with the conclusion of Li et al. (2018), who suggested that the larger NO emission in the peak region may be caused by temperature overestimation in the TIEGCM.

Figure 6 shows the error distribution of NO emission at 130 km as a function of latitude and longitude from 2010 to 2015. The error calculation in Figure 6 is the same as that in Figure 5. The height distribution of NO emission rates from NOE3D is also consistent with the SABER observations. The mean error of NOE3D is 0.013, indicating that the NO fluxes from NOE3D are slightly larger than the SABER observations. The NO emissions from the TIEGCM have large discrepancies along both latitude and longitude. The errors of the TIEGCM NO emission are mostly positive and far from zero. Thus, the TIEGCM results are significantly greater than the SABER observations. The TIEGCM NO emission has a large error at low and middle latitudes, but the error decreases at high latitudes.
The average error of NOE3D is much smaller than that in the TIEGCM, and the NO fluxes of NOE3D are more consistent with the SABER observations. In summary, compared with the TIEGCM, NOE3D performs better.
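The height-resolved error statistics discussed above can be sketched as follows, assuming the error is log10(model) − log10(SABER) evaluated only at observed grid points. The function name and the array layout `(n_images, n_heights, n_lat, n_lon)` are assumptions made for illustration.

```python
import numpy as np


def log_error_by_height(model, saber, mask):
    """Mean error, RMSE, and sample count per height level.

    model, saber, mask: arrays of shape (n_images, n_heights, n_lat, n_lon).
    Only points with mask > 0 and positive model and SABER values are used.
    """
    model = np.asarray(model, dtype=float)
    saber = np.asarray(saber, dtype=float)
    ok = (np.asarray(mask) > 0) & (model > 0) & (saber > 0)

    # Unusable values are replaced by 1 so that log10 is defined (error 0),
    # and they are excluded from the averages through the sample count n.
    err = np.log10(np.where(ok, model, 1.0)) - np.log10(np.where(ok, saber, 1.0))

    axes = (0, 2, 3)                      # reduce over images, latitude, longitude
    n = ok.sum(axis=axes)
    safe_n = np.maximum(n, 1)             # avoid division by zero
    mean = np.where(n > 0, err.sum(axis=axes) / safe_n, np.nan)
    rmse = np.where(n > 0, np.sqrt((err ** 2).sum(axis=axes) / safe_n), np.nan)
    return mean, rmse, n
```

Under this definition, an error of +0.3 corresponds to a model value about twice the observation (10^0.3 ≈ 2), so the mean RMSE values of 0.19 and 0.35 quoted above are in logarithmic (base-10) units.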

A Storm Event
The capability of NOE3D to reproduce the observed NO emission is also examined during the April 5, 2010 geomagnetic storm. Note that SABER observations during this storm were not used in training the NOE3D model. The geomagnetic index Dst during this storm is shown in the top panel of Figure 7. This storm was triggered by an interplanetary coronal mass ejection at 08:27 UT (Lu et al., 2014) and caused the strongest disturbance at 14:00 UT on April 6, 2010, with a minimum Dst of −81 nT. The NO emission is sensitive to particle precipitation and neutral temperature, which are relevant to NO production and activation (Bailey et al., 2002; Saetre et al., 2004). During the storm, the disturbed magnetosphere and solar wind enhanced particle precipitation and Joule heating and thus caused the NO emission to increase severalfold. NO emission responds quickly to changes in particle precipitation and neutral temperature, and balances most of the energy inputs (Chen & Lei, 2018). Thus, the storm event is an appropriate condition for examining the capability of NOE3D in space weather events.
The evolution of NO flux along the satellite descending orbits is shown in Figure 7. Similar to the SABER observations, the NO fluxes from the NOE3D model increased significantly during the storm. This feature is also captured by the TIEGCM. The prominent enhancements of SABER NO flux extend to low latitudes, while the storm-time enhancements of NO flux predicted by NOE3D and TIEGCM extend only to middle latitudes. Although the NOE3D model does not reconstruct all the features seen in the SABER data, its NO fluxes are closer in morphology to the SABER observations than those from the TIEGCM.

The height-dependent structure of the NO emission rate plays an important role in thermospheric evolution during the geomagnetic storm. In order to evaluate the storm-time height distribution of the NO emission rate, the evolutions of the average NO emission along the descending orbit around 6:00 LT at high geomagnetic latitudes (75°S-45°S) and low geomagnetic latitudes (15°S-15°N) are shown in Figure 8. At low geomagnetic latitudes (left panel of Figure 8), the low-altitude (110-150 km) average NO emission rate obtained from the TIEGCM is much larger than the SABER observations, while the high-altitude (210-250 km) average NO emission of the TIEGCM is smaller than that of SABER. These discrepancies are consistent with the statistical error distributions as a function of height presented in the previous section. As shown in Figure 8d, the TIEGCM NO flux is about 1.5 times greater than that from the SABER measurements, although its variation over time is similar to that of the SABER NO flux. Compared with the TIEGCM, the NOE3D predictions are more consistent with the SABER observations at all altitudes. The RMSE of the NO emission in NOE3D is just 0.035 × 10⁻⁸ W m⁻³, which is much lower than the TIEGCM RMSE of 0.159 × 10⁻⁸ W m⁻³. At high geomagnetic latitudes, the magnitudes of the TIEGCM and NOE3D outputs are comparable to the SABER observations.
The TIEGCM NO flux shows a similar variation at the onset of the geomagnetic storm, whereas NOE3D performs better during the recovery phase. At higher altitudes (>150 km), the peak NO emissions of TIEGCM are smaller than the measured peak NO emissions and lag the measurements by ∼7 h. Since the NO emission at high geomagnetic latitudes is sensitive to Joule heating and particle precipitation, the discrepancies between TIEGCM and SABER may be associated with the specification of the polar energy injection and auroral precipitation in the model. Unlike TIEGCM, NOE3D captures the storm-time peak value of the measured NO emissions, but the NOE3D results precede the measurements by ∼6 h. The largest deviation between NOE3D and SABER occurs in the main phase of the storm, when thermospheric composition, temperature, and winds are dramatically disturbed. At 21:00 LT, which is the local time of the ascending orbit, NOE3D also captures the main variability of the measurements (not shown). The RMSEs of the NOE3D NO emissions are less than half the RMSEs of the TIEGCM at both high and low geomagnetic latitudes along the ascending orbit. At low geomagnetic latitudes, the discrepancies of TIEGCM at 21:00 LT are similar to those at 6:00 LT. Overall, there is a large difference between TIEGCM and the observations at low geomagnetic latitudes, while NOE3D performs well at both high and low geomagnetic latitudes.

Discussion and Summary
We incorporated a context loss function into a 3-D CNN to extract the features of sparse SABER 3-D images. Using this method, the NOE3D model reconstructed the 3-D global map of NO emission at 5.3 μm. Correlation analysis shows that the NOE3D model performs well not only for the training data set but also for the test data set. Compared with the TIEGCM, the NO emission from the NOE3D model is closer to the SABER observations. The error analysis shows that the TIEGCM has an obvious systematic deviation: its NO flux is generally larger than the SABER observations. The height distribution of the NO emission error indicates that the TIEGCM prediction is larger in the peak region of NO emission and smaller in the high-altitude regions. The error distribution of the NOE3D model shows that the NO emission predicted by NOE3D is close to the observations at all altitudes. It is interesting to note that the correlation between NOE3D and the observations becomes significantly weaker at high altitudes, but the error does not increase noticeably, and the average error is still close to zero. As discussed above, the SABER observations contain large measurement noise at high altitudes. It is difficult for the CNN to effectively extract features from observations with such large measurement noise. However, deep learning has a strong fitting ability. It can still learn the statistical characteristics of the observed data even when the SABER NO emission images are very noisy, so that NOE3D has no obvious systematic deviation.
Due to the uncertainty in chemical reaction coefficients or other reasons, there is an obvious systematic deviation in the TIEGCM NO emission prediction. The NO emission is an effective thermostat for the thermosphere during and after storm times. Accurately predicting the NO emission rate during storms is critical for studying the evolution of the thermosphere. Our study shows that the storm-time variation of NO flux simulated by TIEGCM is similar to that observed by SABER at low geomagnetic latitudes, but its values are greater than the SABER observations by a factor of 1.5. The TIEGCM NO flux performs better at high geomagnetic latitudes than at low latitudes. However, there are still deviations between simulations and observations at high altitudes (>150 km). Li et al. (2018) compared the TIEGCM temperatures with two empirical thermospheric models, and showed that the TIEGCM temperature is smaller at lower altitudes near 110-135 km and larger at high altitudes compared to the empirical MSIS model. These potential biases in TIEGCM temperature contribute to the NO emission overestimation at low altitudes and underestimation at high altitudes, as shown in the left panel of Figure 8. At high latitudes, the NO emissions are controlled by the polar energy deposition and auroral precipitation. During the storm, the morphology of the TIEGCM NO flux enhancement differs from the SABER measurements, which may have three causes: (1) the TIEGCM high-latitude convection pattern is specified using the empirical Heelis model, which may underestimate the convection electric fields and predict a high-latitude convection pattern that differs from the real pattern (Wu et al., 2014); (2) the auroral precipitation pattern in the TIEGCM may also deviate from the true auroral precipitation. A numerical study by Sheng et al. (2017) demonstrated that the characteristic energy of precipitating particles can modulate the height distribution of storm-time NO emissions, while the precipitating number flux influences the production of NO; and (3) there is large uncertainty in the NO-related chemical reaction rates (Chen & Lei, 2018; Sheng et al., 2017). Thus, it is difficult to accurately simulate the NO emission rate when there are uncertainties in the chemical reaction rates, polar energy inputs, and auroral precipitation.

NOE3D has significant advantages over the TIEGCM, and its prediction is consistent with the SABER observations. However, the NOE3D model has a shortcoming in the prediction of storm-time latitudinal variations. As shown in Figure 7, the NO emission enhancement observed by SABER during the geomagnetic storm extended from high latitudes to low latitudes, whereas the NO emission enhancements in NOE3D were limited to latitudes above 40°. The storm-induced equatorward neutral winds and disturbances transport the polar energy to middle and low latitudes and then heat the atmosphere (Barth et al., 2009; Richards, 2004). The temperature increases contribute to the equatorward extension of the NO emission enhancement because the reaction that produces NO is temperature sensitive (Bailey et al., 2002). NOE3D failed to capture the equatorward extension, possibly because it lacks physical processes. Different from NOE3D, the TIEGCM captures the physical processes needed to understand the storm-time thermosphere variations. Combining NOE3D and TIEGCM may be an effective way to improve the storm-time performance of NOE3D, and will be carried out in future work. On the other hand, it is difficult for a single low Earth orbit satellite to fully capture the NO emission variations during storm times, so NOE3D may not capture all the characteristics of storm-time NO emission from incomplete observations.
In summary, NOE3D basically reproduces the NO emission variations during geomagnetic storms, but its predictive ability needs to be further improved. More observational data should be used to further optimize NOE3D.
In space physics, there are several ways to observe the space environment, but it is still difficult to obtain complete images covering the entire Earth. Over the two decades of SABER observations, a huge number of incomplete images have accumulated, some of which are very sparse. In this study, we focus on extracting valid features from sparse images. We have used the CNN with a context loss function to train NOE3D. The error analysis and the storm event study have demonstrated that the NOE3D model represents the 3-D distribution better than the physics-based TIEGCM and has a better prediction capability during geomagnetic storms. NOE3D also reproduces the climatological variations of NO emission associated with solar and geomagnetic activities.

Data Availability Statement
The SABER data were downloaded from http://saber.gats-inc.com.