Evaluation of the Antarctic Boundary Layer Thermodynamic Structure in MERRA2 Using Dropsonde Observations from the Concordiasi Campaign

Recent high‐resolution dropsonde observations from the 2010 Concordiasi field campaign in austral spring season show that surface‐based inversions (SBIs) over Antarctica are frequently eroded, with well‐mixed boundary layers occurring 33% and 18% of the time in West and East Antarctica, respectively. In this study, using the dropsonde observations, we evaluate the performance of the Modern‐Era Retrospective analysis for Research and Applications, version 2 (MERRA2) in representing the Antarctic boundary layer thermodynamic structure. Results show that MERRA2 has a good overall representation of the Antarctic surface stability and correctly predicts 82% of the SBIs. However, an underprediction of less stable boundary layer occurrence, especially over the elevated East Antarctic plateau, is favored during conditions of increased lower tropospheric stability associated with model dynamics, indicating difficulty in parameterizing turbulence in very stable boundary layers. In addition, a lower tropospheric cool bias (first model level and above) is observed in the MERRA2 reanalysis, especially over West Antarctica, which amplifies in the boundary layer during mixed conditions. The near‐surface cold bias is most pronounced when the model fails to predict mixed layers over West Antarctica and is expected to negatively impact the representation of surface energy budget and melt processes.


Introduction
Antarctica has been characterized by a quasi-permanent surface-based inversion (SBI) due to intense, persistent radiative cooling over the ice sheet that can lead to stratification of up to 25 K in the lowest 10 m (e.g., Vignon et al., 2017). Simulating turbulence in the stable boundary layer remains a challenge for numerical weather prediction and climate models (Holtslag et al., 2013; Mahrt, 2014). Observations over Dome C, Antarctica, show a clear regime separation in turbulence characteristics for very stable boundary layers compared to that of weakly stratified and/or mixed boundary layers, with the latter simulated fairly accurately based on similarity theory (Vignon et al., 2017). On the other hand, understanding the governing physics and simulating turbulence in the former regime are difficult owing to the nonstationarity of the mean flow and difficulty in parameterizing the weak, intermittent coupling between the surface and turbulent layers (Mahrt, 2014; Vignon et al., 2017). Several studies have investigated the temporal evolution of the boundary layer in Antarctica to gain a better perspective of its turbulence characteristics (Argentini et al., 2005; Gallée et al., 2015; King et al., 2006; Mastrantonio et al., 1999). As the scientific community continues to advance the parameterization of the stable boundary layer (Duynkerke, 1999; Mahrt, 2014; Van de Wiel et al., 2002), it has become important to provide verification methods to evaluate the performance of these parameterization schemes in global and regional models. Among other fields, the surface stability is an important parameter that is influenced by turbulence parameterization in the model. In the past, studies evaluating model boundary layer performance over Antarctica have focused on the representation of SBIs (e.g., Boylan et al., 2016; Zhang et al., 2011). However, Ganeshan and Yang (2018) showed that well-mixed boundary layers occur frequently over the continent.
In this study, we perform regional evaluation of the Modern-Era Retrospective analysis for Research and Applications, version 2 (MERRA2) boundary layer thermodynamic structure over Antarctica using ground-truth from the austral spring 2010 Concordiasi field campaign, based on the classification method previously used for dropsonde profiles collected during this campaign (Ganeshan & Yang, 2018). We additionally investigate factors favoring the formation of less stable boundary layers in the model, and the importance of accurate representation of boundary layer thermodynamic structure for modeling the Antarctic surface climate.
The remainder of the paper is organized as follows: section 2 introduces the methodology and the data sets used for the study; section 3 shows the analysis of the representation of Antarctic boundary layer structure in MERRA2 as compared to the dropsonde observations; results will be summarized in section 4.

Data sets and Methodology
As mentioned earlier, the two data sets used in this study are (1) dropsonde observations from the austral spring 2010 Concordiasi campaign (UCAR/NCAR - Earth Observing Laboratory, 2011; Wang et al., 2013) and (2) the MERRA2 reanalysis atmospheric profiles (Gelaro et al., 2017) interpolated to the dropsonde locations. Note that the Concordiasi dropsonde data used here were not assimilated while producing MERRA2.
After quality control (following Ganeshan & Yang, 2018), the dropsonde data set contains a total of 313 atmospheric profiles over the entire Antarctic continent. To evaluate the performance of MERRA2 boundary layer thermodynamic structure, we adopt the classification scheme used by Ganeshan and Yang (2018) and categorize the soundings into surface-based inversions (SBIs), noSBIs, and mixed layers (MLs). The scheme is based on the temperature and potential temperature gradients in the lowest 100 m: SBIs are profiles with positive temperature and potential temperature gradients, noSBIs are profiles with a negative temperature gradient but a positive potential temperature gradient, and MLs are profiles with negative temperature and potential temperature gradients.
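The three-way classification described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `classify_profile`, the use of a bulk (two-point) gradient between the lowest level and the level nearest 100 m, and the treatment of exactly zero gradients as non-positive are all assumptions.

```python
import numpy as np

def classify_profile(height_m, temp_c, theta_k, depth_m=100.0):
    """Return 'SBI', 'noSBI', or 'ML' for one sounding (hypothetical helper).

    Gradients are bulk differences between the lowest level and the level
    closest to depth_m above it (an assumption; the source only states that
    gradients in the lowest 100 m are used).
    """
    z = np.asarray(height_m, dtype=float)
    z = z - z[0]                                  # height above lowest level
    temp = np.asarray(temp_c, dtype=float)
    theta = np.asarray(theta_k, dtype=float)

    top = int(np.argmin(np.abs(z - depth_m)))     # level nearest 100 m
    if top == 0:
        raise ValueError("profile needs at least one level above the surface")

    dT_dz = (temp[top] - temp[0]) / z[top]
    dtheta_dz = (theta[top] - theta[0]) / z[top]

    if dT_dz > 0 and dtheta_dz > 0:
        return "SBI"      # temperature increases with height
    if dtheta_dz > 0:
        return "noSBI"    # temperature decreases, potential temperature increases
    return "ML"           # both decrease (or are zero, by assumption)
```

For example, a profile warming by 4°C over the lowest 100 m is returned as `"SBI"`, whereas one cooling faster than the dry adiabatic lapse rate is returned as `"ML"`.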
MERRA2 is the newest global atmospheric reanalysis product by NASA's Global Modeling and Assimilation Office, using the Goddard Earth Observing System (GEOS) model run at ½° latitude and 5/8° longitude resolution and 72 model levels, and the Gridpoint Statistical Interpolation (GSI) analysis scheme (Gelaro et al., 2017). One of the key strengths of MERRA2 is improved estimates of surface mass balance and surface temperatures over ice sheets (Cullather et al., 2014; Cullather & Bosilovich, 2012; Molod et al., 2015), which can be expected to contribute to an improved representation of the Antarctic surface stability.
The MERRA2 atmospheric profiles are available every 3 hours. Spatially, the comparison data set is generated by interpolating the profiles from their original grid points to the dropsonde locations. Temporally, data from the nearest time step are used. The MERRA2 assimilated fields (temperature, pressure, altitude, specific humidity, and zonal and meridional winds) are output at 72 model levels from roughly 50 m (first model level) to ~80 km. In addition, we use the MERRA2 hourly instantaneous values of the 2 m temperature, winds, and specific humidity, which are also collocated with the dropsonde profiles. In the GEOS model, 2 m fields are diagnostically computed based on interpolation between the surface and the first model level. Improvements to the estimates of surface characteristics over the ice sheet are therefore expected to be reflected in the 2 m fields of the model. Each MERRA2 profile thus contains 73 levels in total, and, as for the dropsonde data, the MERRA2 profiles are categorized into surface stability classes (SBIs, noSBIs, and MLs) following Ganeshan and Yang (2018), based on the temperature and potential temperature differences between the first model level (at ~50 m) and the 2 m level. Unlike Ganeshan and Yang (2018), we do not subcategorize mixed layers as convective or neutral, as most of them are neutrally stratified (median potential temperature gradient ≈ −0.006°C/m).
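The temporal collocation and the two-level stability classification of the model profiles can be illustrated as follows. This is a sketch under stated assumptions: the function names, the dry-air constant `KAPPA`, the 1000 hPa reference pressure, and the requirement that a pressure be supplied for the 2 m level are all hypothetical choices, not details given in the text.

```python
import numpy as np

P0_HPA = 1000.0   # reference pressure for potential temperature (assumed)
KAPPA = 0.2857    # R_d / c_p for dry air (assumed)

def nearest_time_index(model_times, obs_time):
    """Index of the model time step closest to the dropsonde time."""
    times = np.asarray(model_times, dtype="datetime64[s]")
    obs = np.datetime64(obs_time, "s")
    return int(np.argmin(np.abs(times - obs)))

def potential_temperature(temp_k, pressure_hpa):
    """Dry potential temperature in kelvin."""
    return temp_k * (P0_HPA / pressure_hpa) ** KAPPA

def two_level_class(t2m_k, p2m_hpa, t_lev1_k, p_lev1_hpa):
    """Stability class from differences between the 2 m level and the
    first model level (hypothetical helper; zero differences count as
    non-positive by assumption)."""
    d_temp = t_lev1_k - t2m_k
    d_theta = (potential_temperature(t_lev1_k, p_lev1_hpa)
               - potential_temperature(t2m_k, p2m_hpa))
    if d_temp > 0 and d_theta > 0:
        return "SBI"
    if d_theta > 0:
        return "noSBI"
    return "ML"
```

With 3-hourly model times, a dropsonde released at 04:40 UTC would be paired with the 06:00 UTC time step, since it is closer than 03:00 UTC.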
The GEOS atmospheric model used in MERRA2 includes parameterization for atmospheric convection, large-scale precipitation and cloud cover, longwave and shortwave radiation, turbulence, gravity wave drag, a land surface model, and a prognostic representation of surface hydrology and albedo for glaciated land surfaces. In particular, the model employs the local Louis et al. (1982) scheme for parameterizing turbulence in the stable boundary layer. This scheme estimates the heat and momentum diffusivity coefficients at each time step from the turbulent length scale and the Richardson number, with the turbulent length scale determined using the planetary boundary layer (PBL) depth from the previous time step. In MERRA2, the PBL depth is defined as the model level where the eddy heat diffusivity coefficient (K_H) falls below a threshold of 2 m² s⁻¹ (McGrath-Spangler et al., 2015). The nonlocal Lock et al. (2000) scheme is used to parameterize turbulence in unstable boundary layers. In order to compare the relative contribution to the observed MERRA2 stability structure from different parameterized processes (radiation, moist physics, turbulence, and so on), we will investigate the model-generated temperature tendency profiles at corresponding locations and times. The temperature tendency is the average temperature change between the following and previous time steps centered around the hour of assimilation and can provide important information regarding dominant processes affecting the boundary layer stability at a given time.
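The threshold-based PBL depth diagnostic mentioned above can be sketched as a search through an eddy heat diffusivity profile. This is only an illustration: the function name `pbl_depth`, the upward search from the lowest level, and the fallback when the diffusivity never drops below the threshold are assumptions, and the operational GEOS diagnostic may differ in detail.

```python
import numpy as np

def pbl_depth(height_m, kh_m2_s, threshold=2.0):
    """Height of the first level (searching upward) where the eddy heat
    diffusivity falls below the threshold (hypothetical helper).

    height_m and kh_m2_s are ordered from the lowest model level upward.
    If the diffusivity never falls below the threshold, the top of the
    supplied profile is returned (an assumption).
    """
    height = np.asarray(height_m, dtype=float)
    kh = np.asarray(kh_m2_s, dtype=float)
    below = np.nonzero(kh < threshold)[0]
    if below.size == 0:
        return float(height[-1])
    return float(height[below[0]])
```

Under this sketch, a profile whose lowest level already has a diffusivity below 2 m² s⁻¹ yields a PBL depth equal to the height of that level, which corresponds to the weakly turbulent, stable cases.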
Prior to verification of surface stability against dropsonde observations, we first inspect the MERRA2 performance in the temperature, wind speed, and specific humidity profiles within the lowest 2.5 km.

Evaluation of MERRA2 Lower Tropospheric Fields Over Antarctica
Figure 1 compares MERRA2 temperature, specific humidity, and wind speed profiles within the lowest 2.5 km against the soundings obtained from dropsonde observations. Overall, the mean MERRA2 profiles agree well with the atmospheric soundings. The mean root-mean-square error (RMSE) of temperature and specific humidity is around 2.5°C and 0.12 g kg⁻¹, respectively, with both errors increasing toward the surface (Figures 1c and 1i). For the MERRA2 wind speed profile, the mean RMSE is 3.2 m s⁻¹ and the maximum values occur above the first model level (Figure 1f). While no uniform bias is observed for the wind speed profile (Figure 1e), the temperature errors are biased cold throughout the lowest 2.5 km (Figure 1b) and moisture errors are biased dry in the lowest 1.5 km (Figure 1h). It is possible that the cold bias (Figure 1b), which is a well-known problem associated with predicting the nighttime or wintertime continental stable boundary layers (Beljaars & Viterbo, 1998; Sandu et al., 2013; Viterbo et al., 1999), is partly caused by the overestimation of surface radiative losses due to the dry bias in the lower troposphere (Figure 1h). Irrespective of the cause of the cold bias, there exists a risk of runaway cooling when encountering stable boundary layers because of the increased dampening of turbulent fluxes in a more stably stratified atmosphere (Sandu et al., 2013).
Figure 1. Profiles of mean (solid) and standard deviation (shaded) of (a) temperature, (d) wind speed, and (g) specific humidity for MERRA2 (red) and dropsonde (black); and the mean (solid) and standard deviation (shaded) of their differences shown as errors in (b), (e), and (h). The corresponding root-mean-square errors (RMSEs) are shown (black, solid line) in (c), (f), and (i) and the mean RMSE averaged from surface to 2.5 km is also shown (blue, dashed line).
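The error statistics shown in Figure 1 (mean error, its standard deviation, and RMSE at each level, plus the vertically averaged RMSE) can be computed along the following lines. This is a minimal sketch: the function name `error_profiles` and the assumption that model and dropsonde profiles have already been placed on a common height grid, with missing values stored as NaN, are illustrative choices not taken from the text.

```python
import numpy as np

def error_profiles(model, obs):
    """Per-level bias, error spread, RMSE, and the vertically averaged RMSE.

    model, obs: arrays of shape (n_profiles, n_levels) on a common
    height grid (assumed); NaNs mark missing data.
    """
    diff = np.asarray(model, dtype=float) - np.asarray(obs, dtype=float)
    bias = np.nanmean(diff, axis=0)                # mean error per level
    spread = np.nanstd(diff, axis=0)               # std. dev. of errors per level
    rmse = np.sqrt(np.nanmean(diff ** 2, axis=0))  # RMSE per level
    return bias, spread, rmse, float(np.nanmean(rmse))
```

A negative `bias` at a given level corresponds to the model being colder (or drier, or less windy) than the dropsondes at that level.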
Historically, a cold bias over nighttime and wintertime land regions has been rectified by modifying the turbulent closure of the Louis (1979) scheme to enhance diffusion under stable conditions. This approach is widely practiced in operational numerical weather prediction centers (Sandu et al., 2013) and is known to have detrimental effects on the stable boundary layer parameterization (Cuxart et al., 2006; Svensson et al., 2011; Svensson & Holtslag, 2009). For MERRA2, in particular, the stability functions in the surface layer are replaced by the Helfand and Schubert (1995) scheme to increase turbulent heat exchange under stable conditions. The MERRA2 near-surface (2 m) temperatures, which are interpolated between the surface and the first model level based on similarity theory, appear to be particularly sensitive to this stability-based modification of the turbulence closure. Figure 2a shows the relation between observed and modeled temperatures at the 2 m level (left panel) and the first model level (right panel). Although the temperature RMSE is highest at the 2 m level, the most significant bias (−2°C) occurs at the first model level, which results from consistently lower model temperatures compared to observations (right panel of Figure 2a). The scatter is more pronounced at the 2 m level, which explains the higher RMSE (left panel of Figure 2a). At this level, MERRA2 temperatures are too cold when it is warm and too warm when it is cold, effectively reducing the mean bias to ~−0.3°C. The nature of this reduced near-surface cool bias can be seen more clearly when the data are binned based on MERRA2 skin temperatures, which act as a proxy for stability regimes (Figure 2b). The 2 m temperature bias has a distinctive response and sign change from negative to positive when transitioning from warm (less stable) to cold (more stable) skin temperatures (regimes), suggesting that it is sensitive to stability-based modifications of the turbulence closure.
The cold bias at the first model level, on the other hand, is more robust against variations in the model skin temperature.
In the next section, we will investigate the representation of MERRA2 boundary layers over the Antarctic continent, by comparing the occurrences of SBIs, noSBIs, and MLs against dropsonde observations.

Evaluation of Antarctic Surface Stability in MERRA2
As mentioned in section 2, based on the temperature and potential temperature gradients in the lowest 100 m, the dropsonde profiles are classified into SBIs (profiles with positive temperature and potential temperature gradients), noSBIs (profiles with a negative temperature gradient but a positive potential temperature gradient), and MLs (profiles with negative temperature and potential temperature gradients). Figure 3a shows their occurrence frequency. Even though MERRA2 underestimates (overestimates) the occurrence of MLs (noSBIs), it has a fairly good representation of the SBI frequency. Figure 4 shows the mean MERRA2 temperature, specific humidity, and wind speed profiles along with the errors (standard deviation of differences between MERRA2 and dropsondes) during SBIs, noSBIs, and MLs. SBIs occur in a significantly colder, drier, and less windy environment accompanied by negative temperature and absolute humidity errors and positive wind speed errors between 200 and 1,000 m above the surface. The model biases are less pronounced at the 2 m level, with a positive bias observed for temperature at this level. As discussed in section 3.1, the negative correlation with skin temperatures and the compensating positive bias in MERRA2 2 m temperature errors specifically occur over the coldest surfaces (Figure 2b) that are evidently associated with the most stable boundary layers (SBIs; top left panel of Figure 4). In the case of noSBIs and mixed layers, the boundary layer is relatively less stable and the cold bias persists at the 2 m level (center left and bottom left panels of Figure 4). Figure 4 further shows that noSBIs and mixed layers occur in a much warmer and wetter environment. During noSBIs, there is a significant warm and wet bias in the upper levels (above 1,000 m). For mixed layers, the upper levels have relatively smaller errors without a significant bias. There is a cool bias below 1,000 m, though it is not as pronounced as for SBIs.
Moreover, there is a negative wind speed bias throughout the lowest 2.5 km for mixed layers, and similarly for noSBIs (except in the lowest levels below 500 m).
It has been shown that the destruction of Antarctic SBIs during the Concordiasi campaign is mainly caused by mechanical turbulence (Ganeshan & Yang, 2018). To examine the turbulence characteristics of SBIs, noSBIs, and MLs in MERRA2, we compare the model-generated mean profiles of eddy heat diffusivity coefficient (K_H) for the three types of boundary layers (Figure 5). Note that most of the variability in the K_H profiles stems from the Louis et al. (1982) scheme, which is used to parameterize stable boundary layer turbulence in the model, whereas the Lock et al. (2000) scheme remains largely inactive. There is indeed evidence of greatest turbulent activity below 500 m for MLs followed by noSBIs, where K_H often exceeds the threshold used to determine the model boundary layer height (2 m² s⁻¹), indicating that the Louis et al. (1982) turbulence scheme contributes to the evolution from SBI (stable) to noSBI and/or mixed (less stable) boundary layer. For noSBIs, the K_H values have a second elevated peak between 1,000 and 1,500 m, indicating cloud-related turbulence and corresponding to the positive moisture bias in Figure 4.
To quantify the performance of MERRA2 in representing the Antarctic boundary layer structure, we combine noSBIs and MLs into one category, namely, less stable boundary layers. The enhanced near-surface turbulence (Figure 5), 2 m cold bias, and generally warmer and wetter environment (Figure 4) associated with noSBIs and MLs justify the combination. Table 1 classifies the occurrence of less stable boundary layers (noSBIs and/or MLs) in MERRA2 as hits and false alarms, and the occurrence of SBIs (hereafter synonymous with stable boundary layers) as correct rejections and misses (Abdi, 2007). Table 1 shows that, overall, the probability of hits is 49% and that of correct rejections is 82%. The probability of misses (51%) is higher compared to the false alarm rate or probability of false detection (18%). Figure 6 indicates that the model has better ability to predict less stable boundary layers in West Antarctica (63% probability of hits) compared to East Antarctica (43% probability of hits), whereas the percentage of false alarms is slightly higher. Note that East Antarctica is considered to be the region eastward of 45°W up to 171°E (as in Ganeshan & Yang, 2018). In the next section, we will investigate the reasons for misrepresenting the occurrence of less stable boundary layers in MERRA2.
Figure 7a shows the distribution of hits, misses, and false alarms over Antarctica. The highest density of misses (lowest density of hits and false alarms) occurs over elevated regions of the East Antarctic Plateau.
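The hit, miss, false alarm, and correct rejection probabilities used in this evaluation follow from a 2×2 contingency table in which hits and misses are normalized by the number of observed less stable boundary layers, and false alarms and correct rejections by the number of observed SBIs. The sketch below assumes this normalization (it is consistent with the reported pairs summing to 100%); the function name `contingency_rates` is hypothetical.

```python
def contingency_rates(hits, misses, false_alarms, correct_rejections):
    """Probabilities from 2x2 contingency counts (hypothetical helper).

    hits, misses: observed less stable boundary layers that the model
    did / did not classify as less stable.
    false_alarms, correct_rejections: observed SBIs that the model
    did not / did classify as SBIs.
    """
    observed_less_stable = hits + misses
    observed_stable = false_alarms + correct_rejections
    if observed_less_stable == 0 or observed_stable == 0:
        raise ValueError("each observed category needs at least one profile")
    return {
        "hit": hits / observed_less_stable,
        "miss": misses / observed_less_stable,
        "false_alarm": false_alarms / observed_stable,
        "correct_rejection": correct_rejections / observed_stable,
    }
```

As a check, the subsample counts reported in the Figure 9 caption (16 hits, 17 misses, 13 false alarms, 54 correct rejections) give approximately 48%, 52%, 19%, and 81%, matching the percentages quoted for that subsample.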
Binning the data into quartiles of surface elevation (Figure 7b) further shows that the probability of hits and false alarms decreases (whereas that of misses increases) with increasing altitude. This suggests that the turbulence scheme underpredicts mixing in very stable boundary layers that occur over high-altitude regions dominated by strong surface radiative cooling. Figure 8 compares the MERRA2 mean eddy heat diffusivity coefficient (K_H) profiles for hits, misses, false alarms, and correct rejections. As expected, the mean K_H profile associated with hits has a pronounced peak in the lower levels, with values exceeding the 2 m² s⁻¹ threshold, indicating that a well-defined boundary layer is often correctly predicted by the Louis et al. (1982) scheme. Similarly, during correct rejections, K_H values in the lowest model levels rarely exceed 2 m² s⁻¹, with mean values close to zero, thereby indicating the lack of boundary layer turbulent mixing. Compared to correct rejections, even though the K_H values at the lowest model levels are higher for false alarms and misses, their mean values are lower than the 2 m² s⁻¹ threshold, and there are no striking differences between the two categories. This suggests that the low-level negative temperature gradient associated with false alarms is not necessarily a result of enhanced turbulence in the model. Other factors such as moist physics, friction, and radiation can influence the low-level temperature structure and, in the case of false alarms and misses, consequently lead to the misrepresentation of MERRA2 Antarctic surface stability. Thus, in order to further explore the reason for the occurrence of misses and false alarms in the model, we investigate the temperature tendency at the lowest levels and the relative contribution from model physics and model dynamics.
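The elevation binning behind Figure 7b can be sketched as follows for the hit probability. This is an illustration only: the function name, the restriction to observed less stable profiles, and the use of sample quartiles as bin edges are assumptions about how such a binning could be done.

```python
import numpy as np

def hit_rate_by_elevation_quartile(elevation_m, is_hit):
    """Hit probability in each quartile of surface elevation.

    elevation_m: surface elevation for each observed less stable profile.
    is_hit: True where the model also has a less stable boundary layer.
    Returns four values, lowest to highest quartile (hypothetical helper).
    """
    elevation = np.asarray(elevation_m, dtype=float)
    hit = np.asarray(is_hit, dtype=bool)
    edges = np.quantile(elevation, [0.25, 0.5, 0.75])
    # right=True assigns values equal to an edge to the lower bin
    quartile = np.digitize(elevation, edges, right=True)  # values 0..3
    return np.array([hit[quartile == q].mean() for q in range(4)])
```

A hit rate that decreases from the first to the fourth quartile would correspond to the behavior described above, with misses becoming more likely at higher elevation.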
MERRA2 temperature tendency is output at 42 standard pressure levels for every grid point, subsequent to vertical interpolation from the 72 model levels; thus, the first pressure level can be well above the surface in some cases. As we are primarily interested in the contribution of model physics and dynamics to temperature tendency within the atmospheric boundary layer, we only consider profiles where the height of the first interpolated pressure level is within 100 m from the surface. This includes roughly one third of all profiles, and a similar distribution of hits (48%), misses (52%), false alarms (19%), and correct rejections (81%) is observed across this subsample. Figure 9a shows the fractional contribution to absolute temperature tendency at the lowest pressure level for each category resulting from radiation, turbulence, moist physics, friction, model dynamics, and analysis updates. For all categories, the largest contribution to absolute changes in low-level temperatures is from model dynamics followed by parameterized turbulence, analysis updates, and radiation. The temperature tendency due to analysis updates, which is the information added using real-time observations to the model background, is similar across all categories. (It is used to derive the best estimate of temperature and other model fields and is therefore unlikely to contribute to misrepresentation of the boundary layer temperature structure.) The average contribution to temperature tendency from gravity wave dynamics is less than 1% (not shown).

Earth and Space Science, 10.1029/2019EA000890

In the case of hits, there is a more substantial contribution from friction and moist physics to the absolute temperature tendency (Figure 9a); however, their mean values are not significantly different compared to other categories (Figure 9b). For misses, although the mean low-level warming due to friction appears to be significant in Figure 9b, such an effect is only observed for a single profile (outlier) and is therefore not representative of all cases. In general, misses are accompanied by significant cooling due to model dynamics, signaling the occurrence of cold air advection at the lowest model levels. This near-surface cooling (cold air advection) is even more remarkable in the context of upper-level warm air advection (not shown), which results in an overall decrease in the lapse rate (Figure 9c). The lapse-rate tendency shown in Figure 9c is the difference in the temperature tendency due to model dynamics between the first and second pressure levels. The pronounced negative differences in the case of misses show that the upper levels experience relatively stronger warm air advection compared to lower levels, which in turn increases the lower tropospheric stability and evidently contributes to the misrepresentation of the boundary layer thermal structure. As discussed in section 1, there are significant challenges in parameterizing turbulence in very stable boundary layers (Mahrt, 2014; Vignon et al., 2017). While the artificial enhancement of turbulent diffusion for stable boundary layers may lead to an improved representation of large-scale atmospheric dynamics and model biases (Sandu et al., 2013), it has been shown to be detrimental for representing wind characteristics in layers of increased stability (Bosveld et al., 1999; Bosveld et al., 2008; Brown et al., 2005; Brown et al., 2008; Cuxart et al., 2006; Svensson & Holtslag, 2009).
For false alarms, on the other hand, there is a significant weakening of the mean radiative cooling at the lowest levels compared to misses and correct rejections (95% significance level). More than 30% of the false alarm cases have a net positive radiation contribution near the surface. (Note that a similar percentage is observed for hits but not for misses and correct rejections). For correct rejections, the mean radiative cooling and mean turbulent cooling are both significantly stronger (95% confidence level; Figure 9b).
In summary, Figures 9b and 9c suggest that at the lowest levels, false alarms are associated with weaker radiative cooling and misses are associated with cold air advection and increased surface stability. For hits alone, there is a strong contribution from moist physics, indicative of condensational heating and/or evaporative cooling tendency. The reason for the frequent underprediction of boundary layer mixing (high occurrence frequency of misses) appears to be related to deficiencies in parameterizing turbulence for very stable boundary layers (typically observed over high-altitude regions; Figure 7) and to cold air advection at the lowest levels (relative to warm air advection aloft; Figures 9b and 9c) that evidently increases the lower tropospheric stability in the model.
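The two tendency diagnostics used in Figure 9 can be sketched as below. Both definitions are assumptions inferred from the text: the fractional contribution is taken as the magnitude of each term divided by the sum of the magnitudes of all terms, and the lapse-rate tendency is taken as the dynamics tendency at the lowest pressure level minus that at the second lowest level, so that negative values indicate relatively stronger warming aloft. The function names and sample values are hypothetical.

```python
def fractional_contributions(tendencies_k_per_s):
    """Percent contribution of each process to the absolute temperature
    tendency at one level (assumed definition: |term| / sum of |terms|)."""
    total = sum(abs(v) for v in tendencies_k_per_s.values())
    if total == 0:
        raise ValueError("all tendencies are zero")
    return {name: 100.0 * abs(v) / total
            for name, v in tendencies_k_per_s.items()}

def lapse_rate_tendency(dynamics_lowest, dynamics_second_lowest):
    """Dynamics tendency at the lowest pressure level minus that at the
    second lowest level (assumed sign convention), in K/s."""
    return dynamics_lowest - dynamics_second_lowest

# Hypothetical values for one profile (K/s), for illustration only
example = {"dynamics": -4e-5, "turbulence": 2e-5,
           "radiation": -1e-5, "analysis": 1e-5}
shares = fractional_contributions(example)
```

In this made-up example, model dynamics accounts for half of the absolute tendency, and a lowest-level dynamics tendency of −3e-5 K/s under a second-level value of 1e-5 K/s gives a negative lapse-rate tendency, that is, a stabilizing change.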
In the next section, we will investigate the consequence of missing turbulent mixing, and the relation between the representation of surface stability and surface temperature in the model.

Relationship Between Surface Stability Representation and Near-Surface Temperature Errors
As discussed in section 3.2, the model has a higher probability of predicting less stable boundary layers (hits) over West compared to East Antarctica (Figure 6). A higher fraction of cloud cover (Bromwich et al., 2012; Nicolas & Bromwich, 2011) and frequency of mixed boundary layers (Ganeshan & Yang, 2018) occur over the moist tongue region of West Antarctica, and it is possible that the sensitivity to moist physics (as seen for hits in Figure 9a) plays a role in the higher prediction accuracy of less stable boundary layers over this region. Even so, the repercussions of misrepresenting the boundary layer stability are more severe over West Antarctica, as the lower tropospheric cool bias is evidently larger below 1,500 m (as indicated in Figure 10a). Figure 10c shows an accompanying dry bias that is also more pronounced for West compared to East Antarctica, reaffirming that the cold bias in Figure 1 is related to increased surface radiative losses associated with reduced atmospheric water vapor path. The wind speed bias, on the other hand, is comparable for both regions (Figure 10b). Figure 11 further shows that compared to East Antarctica, a stronger negative lower tropospheric temperature bias in the model is observed across all categories (hits, misses, correct rejections, and false alarms) over West Antarctica. During mixed conditions in the model, the tropospheric cool bias is amplified at the lowest levels in both East and West Antarctica (Figures 11c and 11d) and becomes significantly stronger compared to stable conditions (Figure 11a). Conversely, during correct rejections, which are representative of stable boundary layers in the model and observations, the 2 m temperature bias is least negative, likely due to the compensating effect of enhanced turbulent exchange in the Helfand and Schubert (1995) surface layer scheme. The mean bias at the 2 m level is substantially more negative during hits (~−4.4°C for both East and West Antarctica) compared to correct rejections (−0.03°C for West and 2.6°C for East Antarctica). In general, a more dramatic mean 2 m temperature cold bias occurs over West Antarctica, with a maximum value observed during misses (−6.9°C) as seen in Figure 11b.
Figure 9. (a) The relative contribution (%) to the absolute temperature tendency and (b) the temperature tendency (K s⁻¹) at the lowest pressure level (within 100 m from the model surface) due to model physics (radiation, turbulence, moist processes, and friction), model dynamics, and analysis updates, as observed for hits (n = 16), misses (n = 17), correct rejections (n = 54), and false alarms (n = 13) based on a subsample of MERRA2 profiles (n = 100); and (c) same as (b) but for lapse-rate tendency (K s⁻¹) calculated between the second lowest and lowest pressure levels.
In the absence of other biases, missing mechanical mixing is expected to lead to an underestimation of near-surface temperatures (negative errors at the surface) and an overestimation of upper layer temperatures (positive errors at the top of the mixed layer). This is because mechanical mixing, which is considered responsible for SBI erosion in Antarctica (Ganeshan & Yang, 2018), is associated with the downward transport of heat from the overlying warm inversion towards the surface. In East Antarctica, where the model has a relatively weak lower tropospheric cool bias accompanied by a warm bias at the 2 m level during correct rejections (Figure 11a), the errors due to a lack of mixing result in a cancellation of these preexisting biases (Figure 11b). Over West Antarctica, on the other hand, the model has a significant lower tropospheric cold bias during stable conditions (Figure 11a), leading to a dramatic underestimation of near-surface 2 m temperatures (mean bias ≈ −7°C) when turbulent mixing is underpredicted (Figure 11b). For example, Figure 12 shows the temperature profiles for two such missed cases that were observed over the West Antarctic peninsula. In the former case, the observed surface air temperature is close to ice melt conditions (~0°C). MERRA2, however, severely underestimates the 2 m temperature (~−14°C error). In this case, the model's failure in reproducing turbulence and associated warming evidently exacerbates the preexisting tropospheric cold bias in this region.
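The expected error pattern from missing mechanical mixing can be illustrated with an idealized calculation. The sketch assumes equal-mass layers, no heat exchange with the surface, and complete mixing of potential temperature over the lowest levels; the profile values and the function name `mix_lowest_levels` are hypothetical and serve only to show the sign of the errors.

```python
import numpy as np

def mix_lowest_levels(theta_k, n_mixed):
    """Replace potential temperature in the lowest n_mixed levels by its
    mean, conserving heat for equal-mass layers (idealized assumption)."""
    theta = np.array(theta_k, dtype=float)
    theta[:n_mixed] = theta[:n_mixed].mean()
    return theta

# Hypothetical inversion profile (K), lowest level first
inversion = np.array([250.0, 254.0, 258.0, 260.0, 261.0])

observed = mix_lowest_levels(inversion, 3)  # "truth": the inversion was mixed out
model = inversion                           # model kept the inversion (a miss)
model_error = model - observed              # negative near the surface,
                                            # positive at the mixed-layer top
```

In this example the model is 4 K too cold at the lowest level and 4 K too warm at the top of the mixed layer, with no error above it, which is the pattern described above before any preexisting bias is added.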

Conclusions
An accurate representation of the Antarctic atmospheric boundary layer temperature structure is paramount to modeling the regional energy budget and surface mass balance of the ice sheet (e.g., Giovinetto et al., 1990; King et al., 2001; Yang et al., 2014). Recent observations from high-resolution dropsonde measurements during the austral spring season reveal that the Antarctic boundary layer is often well mixed (33% and 18% of the time over West and East Antarctica), with surface-based inversions being eroded as frequently as 47% and 30% in West and East Antarctica, respectively (Ganeshan & Yang, 2018). In this study, we evaluate the Antarctic boundary layer structure representation in MERRA2 adopting the classification used by Ganeshan and Yang (2018) and comparing against collocated and coincident dropsonde profiles.
MERRA2 has a good representation of the Antarctic surface stability in terms of predicting the overall frequency of stable (SBIs) and less stable (noSBIs and MLs) boundary layers (Figure 3). The model evidently underpredicts the occurrence of less stable boundary layers over the East Antarctic Plateau (Figure 7), which are often mixed mechanically during katabatic winds (Ganeshan & Yang, 2018), indicating that the surface and turbulence schemes are less accurate under very stable conditions that occur over high-altitude regions of the continent. In addition, increased lower tropospheric stability due to model dynamics (i.e., cold air advection near the surface coupled with relatively warmer air advected aloft) is likely to contribute to the underprediction of mixed boundary layers in MERRA2 (Figures 9b and 9c). Parameterizing turbulence under very stable conditions is a well-known challenge for numerical weather prediction models. Part of the reason is that similarity theory breaks down for very stable boundary layers (Holtslag et al., 2013; Mahrt, 2014; Vignon et al., 2017). Another reason is the artificial enhancement of near-surface diffusion in stable boundary layers, which is widely practiced for tuning the model to reduce near-surface temperature biases and to improve the representation of the large-scale flow (Sandu et al., 2013). In MERRA2, the cold bias in 2 m temperatures indeed appears to be reduced (sometimes reversed) due to the increased turbulent heat exchange in stable surface layers (Helfand & Schubert, 1995; Molod et al., 2015). The occurrence of false predictions of mixed layers in MERRA2, albeit less frequent (Table 1), is favored during conditions of weak net surface radiative cooling (Figure 9b).
Previous work suggests that well-mixed boundary layers occur frequently in the "moist tongue" region of West Antarctica (Ganeshan & Yang, 2018) and are possibly associated with cloud-turbulence interactions. The more accurate prediction of occurrence frequency of less stable boundary layers in West Antarctica (Figure 6) and the apparent sensitivity to moist turbulent processes (Figure 9a) suggest that the model performs better in capturing these interactions. Even though the model has a better representation of boundary layer mixing over West Antarctica, the repercussions of misrepresenting the surface stability are more severe owing to a strong preexisting lower tropospheric cool bias (Figure 10a), which appears to be related to a dry bias in this region (Figure 10c). The failure to predict turbulent mixing in West Antarctica can further lead to dramatic underestimation of near-surface temperatures (e.g., Figure 12) that are critical for the representation of surface energy, ice mass balance, and the climate of Antarctica. Our results suggest that improvements are needed for the regional representation of Antarctic surface stability through advances in modeling and data assimilation as well as through improvements in parameterizing turbulence in very stable boundary layers.