An EnOI‐Based Data Assimilation System With DART for a High‐Resolution Version of the CESM2 Ocean Component

An ensemble optimal interpolation (EnOI) data assimilation system for a high‐resolution (0.1° horizontal) version of the Community Earth System Model Version 2 (CESM2) ocean component is presented. For this purpose, a new version of the Data Assimilation Research Testbed (DART Manhattan) that enables large‐state data assimilation by distributing state vector information across multiple processors at high resolution is used. The EnOI scheme uses a static (but seasonally varying) 84‐member ensemble of precomputed perturbations to approximate samples from the forecast error covariance and utilizes a single model integration to estimate the forecast mean. Satellite altimetry and sea surface temperature observations along with in situ temperature and salinity observations are assimilated. This new data assimilation framework is then used to produce a global high‐resolution retrospective analysis for the 2005–2016 period. Not surprisingly, the assimilation is shown to generally improve the time‐mean ocean state estimate relative to an identically forced ocean model simulation where no observations are ingested. However, diminished improvements are found in undersampled regions. Lack of adequate salinity observations in the upper ocean actually results in deterioration of salinity there. The EnOI scheme is found to provide a practical and cost‐effective alternative to the use of an ensemble of forecasts.


Introduction
Decadal climate (or Earth system) prediction, which focuses on climate changes on time scales from a year to a decade or more, has been one of the frontier fields in climate science since the early 2000s, mainly because of its potential value to inform, among others, environmental and socioeconomic decisions and policies on these time scales. Because decadal climate predictions are sensitive to both external forcings (natural and anthropogenic) and internal climate variability, they are a combination of forced boundary condition and initial value problems (Meehl et al., 2009). Despite substantial progress and the availability of many decadal climate prediction experiments using fully coupled Earth system models from various modeling centers (e.g., Keenlyside et al., 2008; Mochizuki et al., 2010; Smith et al., 2007; Sugiura et al., 2009; Yeager et al., 2018), many scientific and technical challenges remain (Meehl et al., 2014). Nevertheless, meaningful prediction skill has been found for several key climate fields, such as precipitation and upper-ocean heat content, as well as for ocean biogeochemistry. Moreover, Smith et al. (2019) have recently concluded that climate on decadal time scales is more predictable than previously thought. While a vast amount of work is still needed, these studies were able to establish robust evidence of decadal prediction skill for surface temperature, precipitation, and pressure. They further showed that decadal predictions can capture many aspects of regional changes.
The state-of-the-art decadal climate predictions referenced above usually use relatively coarse (horizontal) resolutions of order 0.25°to 1°in their component models. Recently, there is a growing demand for more accurate and reliable predictions that include a broader range of space and time scales with a more complete and regional representation of weather, climate, and Earth system processes for a variety of applications. Meeting this demand will necessitate new approaches to forecasting that will require higher-resolution models that include, for example, mesoscale physics in their ocean components for both forecasting and assimilation systems. Such high-resolution global models are becoming more widely available with recent increases in computational resources.
Increased resolution is expected to lead to improvements in predictions. For example, Jia et al. (2015) found an improvement of the seasonal prediction of 2-m air temperature and precipitation over land and of the Niño-3.4 index using a high-resolution version of the Geophysical Fluid Dynamics Laboratory's climate model. Shaffrey et al. (2017) showed that predictions based on HiGEM, a higher-resolution version of the HadGEM1 Met Office Unified Model, are significantly more skillful than predictions based on the lower-resolution HadCM3 model at lead times ranging from a year to a decade. Additionally, recent results from Siqueira and Kirtman (2016) indicate that when air-sea interactions associated with oceanic fronts and eddies are adequately resolved, more realistic variability of the ocean dynamics can enhance skill in near-term climate predictions. Using high-resolution and low-resolution predictions with the Community Climate System Model Version 4 (CCSM4) for drought prediction over the Southeast United States, Infanti and Kirtman (2019) found higher skill with the high-resolution version of the model for a 36-month prediction of the mean rainfall. These studies provide evidence that the skill of seasonal to decadal predictions can be increased, and the representation of the climate system improved, by using models with finer resolution.
A well-established source of predictability on decadal time scales comes from the initialization of the ocean state (e.g., Doblas-Reyes et al., 2013; Matei et al., 2012; Robson et al., 2012; Yeager et al., 2012, 2018). To fully exploit the new capabilities of the next generation of global high-resolution climate prediction systems, proper initialization of their eddy-permitting or eddy-resolving ocean components (0.1° or finer horizontal resolution) is required. This represents a major challenge. Specifically, such high-resolution coupled Earth system models are already computationally intensive. The associated cost becomes even more prohibitive when the computational demands of data assimilation approaches are included to provide the necessary initial conditions for these predictions. The latter cost is exceptionally high for state-of-the-art data assimilation methods like the ensemble Kalman filter (EnKF). The cost is further exacerbated when it comes to initialization of decadal prediction simulations. Indeed, a large number of initialization dates, typically covering several decades, is required to robustly evaluate the performance of the system and establish a bias adjustment procedure.
The Data Assimilation Research Testbed (DART; Anderson et al., 2009) framework implements a variety of ensemble filter methods. Performing a multidecade-long data assimilation with an eddy-permitting or eddy-resolving ocean model with DART is also prohibitively expensive, remaining beyond the current NCAR computing capacity and probably beyond the capabilities of the next generation of systems as well. To overcome this major obstacle, alternative data assimilation techniques can be considered. One such alternative is a relatively inexpensive ensemble optimal interpolation (EnOI) scheme. The EnOI scheme was first introduced by Evensen (2003) and can be seen as a low-cost approximation of the EnKF. An important distinction between the EnKF and EnOI is that the EnOI background covariance is either static or, more generally, not rigorously related to the model state. In other words, EnOI does not represent the specific errors of a given assimilation time but rather assumes that the background covariance matrices are state independent and are well represented by a stationary or seasonally varying ensemble. As a consequence, EnOI systems are immune to ensemble collapse but do not have an evolving estimate of the state error. Because of its simplicity of implementation, low computational cost, and many other attractive characteristics, such as quasi-dynamically consistent, multivariate, inhomogeneous, and anisotropic covariances, EnOI is a widely utilized ocean data assimilation method (e.g., Drevillon et al., 2008; Fu et al., 2011; Pan et al., 2014; Sakov & Sandery, 2015; Scott et al., 2018).
While the EnKF has been shown to consistently outperform the EnOI in a regional implementation of a 0.1° horizontal resolution version of the Modular Ocean Model Version 4 (MOM4) by Sakov and Sandery (2015), one must be cognizant of the fact that the reported, relatively modest improvements of 9-21% in forecast accuracy, as measured by the size of the innovation, come at a large computational cost. For example, Sakov and Sandery (2015) used 96 ensemble members, meaning that their EnKF system was roughly 96 times more expensive than the EnOI.
The present work represents our first attempt at data assimilation in a high-resolution (nominal 0.1° horizontal) version of the ocean component of the Community Earth System Model Version 2 (CESM2), using EnOI within the DART framework. The ocean model is the Parallel Ocean Program Version 2 (POP2; Danabasoglu et al., 2012; Small et al., 2014; Smith et al., 2010). This initial step focuses only on the ocean component, whose initialization has been established to be important for improved prediction skill. It serves our long-term goal of developing a proper initialization procedure for an eventual high-resolution Earth system prediction system based on CESM. In section 2, a new release of DART, namely, DART Manhattan, is introduced. This release provides vastly improved memory scaling and is a necessary first step to accommodate the increase in the size of the state vector that comes with global high-resolution ocean data assimilation. Section 3 describes the EnOI scheme as implemented in DART. The observations assimilated are presented in section 4. The ocean model and the simulations are summarized in section 5. Section 6 introduces a prototype global eddy-permitting/eddy-resolving reanalysis for the 2005 to 2016 period and presents an evaluation of the system. Finally, section 7 provides a summary and concluding remarks.

DART
The DART Manhattan release (Data Assimilation Research Testbed, 2019) provides capabilities to do ensemble data assimilation with high-resolution versions of CESM2. DART implements parallel ensemble filters using the algorithm described in Anderson and Collins (2007). A forward operator computes the expected value of an observation from an instrument given the forecast model state. An ensemble of forward operators for each observation is created prior to assimilation of these observations at a given time.
Previous versions of DART required that the whole model state vector for a given ensemble member be resident in the memory space of a single processor, known as a state complete representation. A forward operator could be computed in a straightforward fashion by directly accessing all the needed state variables. Once all forward operators were computed, the state vectors of all ensemble members were transposed in a massive all-to-all communication so that each processor ended up with all the ensemble members of a subset of the state variables in its memory space. The actual data assimilation was done using this ensemble complete representation.
The state vectors of high-resolution CESM versions cannot fit in the memory of a single processor so the state complete representation is impossible. In the Manhattan version, the DART parallel implementation has been enhanced so that model state vectors are read from NetCDF files directly into the ensemble complete representation. A particular process is assigned to compute the entire ensemble of forward operators for a given observation. Since this process stores all ensemble copies for only a subset of the state variables, computing the forward operators now generally requires communication to obtain the value of all required state variables from the other processes that store them. DART Manhattan uses passive target one-sided Message Passing Interface (MPI) communication (Gropp et al., 1999), also referred to as Remote Memory Access (RMA), to allow the processor computing a forward operator to obtain the values of state variables that it does not store (Anderson et al., 2013). Since a single processor computes the entire ensemble of forward operators for a given observation, it is trivial to vectorize across the ensemble. Because RMA is one sided and requires no action by the processor storing a state variable, many processors can be computing ensembles of forward operators simultaneously. When all forward operators have been computed, the assimilation algorithm proceeds in the ensemble complete representation as before.
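The difference between the two memory layouts, and the transpose that the pre-Manhattan code relied on, can be illustrated with a toy numpy sketch. The array sizes and variable names here are ours for illustration only; a real 0.1° POP2 state vector is far too large to hold in a single array, which is precisely the constraint Manhattan removes:

```python
import numpy as np

n_state, n_ens, n_procs = 12, 4, 3   # toy sizes

rng = np.random.default_rng(0)

# State complete: each processor holds ONE full ensemble member, so a
# forward operator can read any needed state variable locally.
members = [rng.normal(size=n_state) for _ in range(n_ens)]

# Ensemble complete: each processor holds ALL ensemble members for a
# SUBSET of the state variables -- the layout the assimilation needs.
full = np.stack(members)                         # (n_ens, n_state)
chunks = np.array_split(full, n_procs, axis=1)   # one chunk per processor

# Pre-Manhattan DART performed this re-layout as a massive all-to-all
# transpose; Manhattan instead reads NetCDF files directly into the
# ensemble complete layout and fetches remote values via one-sided MPI.
```

Once the data are ensemble complete, each chunk carries every ensemble copy of its state variables, so the update of those variables needs no further communication.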
This new implementation in the DART Manhattan release has a number of benefits. It allows far larger model state vectors to be used with DART, increases the scalability of computing the forward operators and permits these computations to be vectorized, and eliminates the massive all-to-all communication required for the transpose from state complete to ensemble complete. Additionally, the DART Manhattan capability to directly read from or write to CESM NetCDF restart files removes the need for an intermediate file when passing data between CESM components and DART. Finally, each ensemble member is read in parallel by different tasks, significantly reducing the I/O time.

Implementing the EnOI Scheme Within DART for the CESM Ocean Component
In the context of this work, data assimilation seeks to infer an optimal estimate, in a least squares sense, of the evolving state of the ocean using a statistical combination of observations and a numerical model describing the evolution of the system over time. The Bayesian paradigm provides a coherent probabilistic approach for the data assimilation problem (Anderson, 2001). Ensemble methods of assimilation, such as the EnKF previously used for the assimilation of in situ ocean observations with DART into the nominal 1° POP2 ocean model (Karspeck et al., 2013), rely on the ensemble approach and assume that the prior probability distribution can be estimated from the statistics of a finite sample of nonlinear ensemble forecasts. The EnKF has gained popularity because of its simple conceptual formulation and relative ease of implementation (e.g., it requires no derivation of tangent linear or adjoint models). But by definition it requires the integration of an ensemble of models which, in the context of a 0.1° eddy-permitting or eddy-resolving ocean model, is computationally prohibitive for long analysis periods of a decade or more. Thus, in this work, the EnOI scheme (Evensen, 2003) is implemented to assimilate observations much more efficiently in the high-resolution version of POP2.
As implemented within DART, the EnOI scheme uses a static, but seasonally varying, ensemble of 84 precomputed perturbations (see section 5 for calculation of these perturbations) to approximate samples from the forecast error covariance and uses a single model integration to estimate the forecast mean, as schematically illustrated in Figure 1. Use of 84 members represents a trade-off between ensemble size and the amount of memory required for the analysis step.

Figure 1. Schematic of the sequential EnOI algorithm (after Anderson et al., 2009). An estimate of the model state at time t_k is advanced to time t_{k+1} by a forecast model ①. A stationary ensemble (4 in this example) of model anomalies is then used to approximate forecast errors ②. A forward observation operator, h, is applied to each state vector ③ to obtain four estimates of an observation ④, denoted by green tick marks. The observed value and the observational likelihood (red tick mark and red curve in the observation space portion of the schematic) are combined with the prior ensemble estimate (green curve) to obtain an updated ensemble estimate ⑤ and increments ⑥ (blue arrows in the observation space portion of the schematic). The increments to the observation ensemble are regressed onto each state vector component ⑦ independently to generate state vector increments (blue arrows in the model space portion of the schematic). The posterior mean (blue asterisk) is computed ⑧ by averaging the posterior state vector. The model is then used to advance the posterior mean state estimate ⑨ to time t_{k+2} when the next observations become available.
The sequential algorithm used in EnOI is very similar to that of the ensemble adjustment Kalman filter (EAKF) already available in DART (Figure 1; Anderson, 2009). Starting with a model state vector x_k at time t_k, the model produces a forecast x_{k+1} at time t_{k+1} (① in Figure 1). An N-member ensemble of model anomalies is used to approximate forecast errors (② in Figure 1) and create an ensemble of priors x_{p,n}. The sequential EnOI algorithm then applies the scalar forward operator h to each sample of the state (③ in Figure 1), resulting in an ensemble of prior estimates for the observation (green tick marks in Figure 1). The sample mean ȳ_p and variance σ_p² of the prior estimate of the observation are computed (④ in Figure 1; green curve). Given the observed value y_o and the observational error variance σ_o² (④ in Figure 1; red tick mark and curve), the product of the prior and the likelihood yields an updated estimate with variance σ_u² = (1/σ_p² + 1/σ_o²)⁻¹ and mean ȳ_u = σ_u² (ȳ_p/σ_p² + y_o/σ_o²). The updated ensemble estimate (⑤ in Figure 1; blue curve) for y, given by y_{u,n} = (σ_u/σ_p)(y_{p,n} − ȳ_p) + ȳ_u, is constructed by shifting the mean and linearly contracting the members to make the sample variance exactly σ_u². An ensemble of observation space increments is defined as Δy_n = y_{u,n} − y_{p,n} (⑥ in Figure 1; blue arrows).
The increments for each state vector component are computed independently by regressing the observation space increments onto the state vector component (⑦ in Figure 1) using the prior joint ensemble sample statistics, so that Δx_{m,n} = (σ_{x_m,y}/σ_p²) Δy_n, where Δx_{m,n} is the increment for ensemble member n of state vector component m, while σ_{x_m,y} is the prior sample covariance of state vector component m and y. The sequential EnOI algorithm then evaluates the posterior mean x̄_u by averaging the posterior state vectors (⑧ in Figure 1; blue asterisk). Finally, the model advances the posterior mean state estimate to time t_{k+2} when the next observations become available (⑨ in Figure 1).
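Steps ③ through ⑦ for a single observation can be sketched in a few lines of numpy. This is a minimal, hypothetical illustration of the equations above, not the DART code; localization, observation ordering, and the static-ensemble bookkeeping are omitted:

```python
import numpy as np

def sequential_obs_update(x_prior, y_prior, y_obs, obs_var):
    """Assimilate one scalar observation into an ensemble.

    x_prior : (n_ens, n_state) prior ensemble (mean plus perturbations)
    y_prior : (n_ens,) prior observation estimates, h applied to each member
    """
    y_mean = y_prior.mean()
    var_p = y_prior.var(ddof=1)                   # sigma_p^2
    var_u = 1.0 / (1.0 / var_p + 1.0 / obs_var)   # product of Gaussians
    mean_u = var_u * (y_mean / var_p + y_obs / obs_var)

    # Shift the mean and linearly contract the members (EAKF-style update).
    y_updated = np.sqrt(var_u / var_p) * (y_prior - y_mean) + mean_u
    dy = y_updated - y_prior                      # observation increments

    # Regress observation increments onto every state vector component.
    x_anom = x_prior - x_prior.mean(axis=0)
    cov_xy = x_anom.T @ (y_prior - y_mean) / (len(y_prior) - 1)
    return x_prior + np.outer(dy, cov_xy / var_p)
```

Observations are processed sequentially; after each update, the posterior ensemble becomes the prior for the next observation.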
The EnOI sequential algorithm eliminates the cost of running an ensemble of global high-resolution forecasts as part of the cycled assimilation. As such, the EnOI has a computational cost about N times less than that of the EnKF, where N again is the ensemble size. This is especially meaningful in the context of a global high-resolution ocean data assimilation experiment, where the forward model is expensive to run. The posterior mean is estimated by averaging the posterior state vectors and is then used as the initial state for the next forecast step of the cycled assimilation.
Because of the limited size of the static ensemble employed by EnOI, the background covariance matrices are rank deficient. This results in nonnegligible correlations between widely separated variables that are believed, a priori, to be uncorrelated (Anderson & Collins, 2007). A remedy for this problem is to use a localization function to restrict the impact of an observation on geographically distant state variables (Houtekamer & Mitchell, 2001). Here, a compactly supported fifth-order piecewise polynomial localization function (Gaspari & Cohn, 1999) with a radius of ~600 km in the horizontal is used. No localization is applied in the vertical, allowing observations at any depth to impact the entire water column.
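The Gaspari and Cohn (1999) fifth-order piecewise rational function is straightforward to evaluate. The sketch below is our own, not DART's implementation; it takes a half-width c and drops to exactly zero at distance 2c, so c ≈ 300 km corresponds to the ~600 km radius quoted above (whether the paper's "radius" refers to the half-width or the cutoff is an assumption on our part):

```python
import numpy as np

def gaspari_cohn(dist, c):
    """Gaspari-Cohn compactly supported correlation function.

    dist : array of separation distances (same units as c)
    c    : half-width; the function is exactly zero for dist >= 2c
    """
    r = np.atleast_1d(np.abs(dist) / c).astype(float)
    w = np.zeros_like(r)
    near, far = r <= 1.0, (r > 1.0) & (r < 2.0)
    rn, rf = r[near], r[far]
    w[near] = ((((-0.25 * rn + 0.5) * rn + 0.625) * rn - 5.0 / 3.0) * rn**2
               + 1.0)
    w[far] = (((((rf / 12.0 - 0.5) * rf + 0.625) * rf + 5.0 / 3.0) * rf
               - 5.0) * rf + 4.0 - 2.0 / (3.0 * rf))
    return w
```

Observation-space increments are multiplied by this weight before being regressed onto state variables, so an observation has full impact locally and none beyond the cutoff distance.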
The data assimilation algorithm also requires estimates of the error variance associated with the observations that are being assimilated. The error includes the instrumental error and the error due to unresolved dynamics in the model, usually referred to as the "representativeness error." The representativeness error is the dominant observational error source in current ocean models (e.g., Oke & Sakov, 2008). The true representativeness error is a complex function of the model's resolved versus unresolved dynamics at the geographic location of the observations. For simplicity, however, a single crude error estimate is utilized globally for each observation type in the current implementation. For the altimetry, the standard deviation of the error is set at 5 cm across all platforms. For temperature, the standard deviation of the error is set at 0.5°C for all types of temperature observations. For salinity, the standard deviation of the error is set at 0.5 psu.

Observations
Four observational data sets are used: dynamic topography (DT), sea surface temperature (SST), in situ temperature, and in situ salinity. All the observations are aggregated over a 1-day window and assimilated as if they were instantaneous observations at 00Z (UTC). Figure 2 illustrates a typical set of observations for a given day (1 March 2005).
The DT is the relevant altimetric signal for assimilation into an ocean model. Satellite altimeters sense the sea surface height (SSH). The SSH is the elevation of the sea surface above a reference ellipsoid. It has two components: (i) the DT which represents the signature of the ocean circulation and (ii) the geoid which reflects the variation of the Earth's gravity field. Ocean models use a uniform gravity field and as a result SSH and DT are the same surface in the model "world." For consistency we will use the term DT throughout this manuscript recognizing that ocean modelers usually favor the usage of SSH.
For this application, the DT is constructed as the sum of the sea level anomalies (SLA) and the mean dynamic topography (MDT). We use the along-track SLA distributed by the Copernicus Marine Environment Monitoring Service (CMEMS; https://www.copernicus.eu), a product formerly provided by Aviso+. By definition, the SLA represent the variable part of the altimetric signal. The CMEMS SLA is computed relative to the 20-year mean for the 1993-2012 period, and all the missions, that is, Jason-3, Sentinel-3A, HY-2A, SARAL/AltiKa, CryoSat-2, Jason-2, Jason-1, T/P, ENVISAT, GFO, and ERS-1/2, are homogenized with respect to a reference mission and processed by the DUACS multimission altimeter data processing system. The global MDT CNES-CLS13 (Rio et al., 2014) is used to reference the SLA to obtain the DT, which represents the absolute signal that results from the ocean circulation. Accurate knowledge of the MDT is key for the optimal exploitation of altimeter data through assimilation. The ocean MDT is the difference between the Mean Sea Surface Height (MSSH), that is, the time average of the sea level above a reference ellipsoid, and the geoid height above the same reference ellipsoid. As a result of the recent dedicated space gravity missions such as GRACE (Gravity Recovery and Climate Experiment; Tapley et al., 2004) and GOCE (Gravity field and steady-state Ocean Circulation Explorer; Drinkwater et al., 2003), knowledge of the geoid has greatly improved in the past few years, so that the ocean MDT is now resolved with centimeter-scale accuracy at spatial scales of around 100-150 km. However, at scales shorter than 100 km, spatial filtering is still needed because of the spectral differences of the two surfaces. Specifically, while the MSSH is known with centimetric accuracy at scales of a few kilometers, the geoid models only achieve this precision at scales larger than 100 km.
MDT information at scales shorter than 100 km is added using oceanographic in situ information, such as Argo floats and drifting buoy velocities (see Rio et al., 2011, for details).
In situ temperature and salinity observations are from the World Ocean Database 2013 (WOD13; Boyer et al., 2013). WOD13 provides uniform, quality-controlled access to a large number of data sets and integrates ocean profile data from approximately 90 countries, collected from buoys, ships, gliders, and other instruments. While WOD13 is comprehensive and now includes on the order of 15 million temperature profiles, the coverage is irregular in both space and time, as illustrated by Figure 3. Finally, the NOAA High-Resolution Optimum Interpolation SST V2 (OISST) data set (Reynolds et al., 2007) is also assimilated. This observational product provides a complete 0.25° daily SST analysis constructed by combining observations from different platforms (Advanced Very High Resolution Radiometer [AVHRR] satellites, ships, and buoys) and by interpolating to fill in gaps caused by clouds.

Ocean Model and Simulations
In its high-resolution version, POP2 uses a tripolar grid with the two northern grid poles located in Alaska and Russia to overcome the geographical North Pole singularity while maintaining a more isotropic resolution in the northern high latitudes. The nominal horizontal resolution is 0.1°. To reduce the computational cost of the model, the present study uses 42 levels in the vertical, in contrast with the 62-level version used in Small et al. (2014), with 10-m resolution near the surface monotonically increasing to 250 m in the abyssal ocean. The bathymetry of the model is derived from the ETOPO2v2 Global Relief Model (https://www.ngdc.noaa.gov/mgg/global/etopo2.html). The model uses partial bottom cells (Adcroft et al., 1997) for a more accurate discretization of the bottom topography. Ocean turbulent mixing is parameterized using the K-profile parameterization (KPP; Large et al., 1994) in the vertical. A biharmonic operator with hyperviscosity and diffusivity scaled by the cube of the local grid spacing is used as closure for lateral mixing by subgrid-scale processes. While freshwater fluxes from river runoff are still incorporated as virtual salt fluxes that handle river inputs by removing salt from the ocean surface instead of adding freshwater volume, a local reference salinity, rather than a global one, is applied. Additionally, these freshwater fluxes are distributed in the vertical over the upper two surface layers of the model, that is, the upper 20 m. Both of these approaches are enabled by the newly implemented Estuary Box Model (EBM; Sun et al., 2017) and have been shown to reduce biases in the simulated salinity near river mouths (Tseng et al., 2016).
Surface forcing is provided by the new atmospheric data sets for driving ocean-sea ice models based on the Japanese 55-year atmospheric reanalysis product (JRA55-do; Tsujino et al., 2018). The JRA55-do data sets have a 55-km spatial and 3-hourly temporal resolution. This finer spatial and temporal resolution, compared to the previously used Coordinated Ocean-ice Reference Experiments (CORE) interannual forcing Version 2 data sets, is particularly beneficial for high-resolution simulations like the one used in the present study. Bulk formulae from Large and Yeager (2009) are used to compute air-sea heat and momentum fluxes. The model sea surface salinity (SSS) is restored toward the upper 10-m average, monthly mean climatological salinity from the World Ocean Atlas 2013 Version 2 (Zweng et al., 2013), using a piston velocity of 50 m over 1 year. It is standard practice to apply such SSS restoring when forcing an ocean-sea ice model (Griffies et al., 2009), and it is needed to maintain a global hydrological balance and avoid model drift in the absence of coupled feedbacks.
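The quoted piston velocity translates into a relaxation tendency on the top model layer. The sketch below is our own back-of-the-envelope illustration (the variable names and the use of the 10-m top layer from the model description are our assumptions); it shows that 50 m/yr acting on a 10-m layer implies an e-folding timescale of about 73 days:

```python
SECONDS_PER_YEAR = 365.0 * 86400.0
piston_velocity = 50.0 / SECONDS_PER_YEAR   # 50 m over 1 year, in m/s
layer_thickness = 10.0                      # top model layer (m), assumed

def sss_restoring_tendency(sss_model, sss_clim):
    """Relaxation tendency (psu/s) applied to the surface layer salinity."""
    return (piston_velocity / layer_thickness) * (sss_clim - sss_model)

# Implied relaxation timescale: h / v_piston = 10 m / (50 m/yr) = 73 days
tau_days = layer_thickness / piston_velocity / 86400.0
```

A saltier-than-climatology surface layer thus receives a negative (freshening) tendency, and vice versa.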
To obtain the static ensemble perturbations and also to have a baseline (control) simulation to evaluate our assimilation, a free-running ocean-sea ice hindcast simulation in which no observations were ingested (denoted as NoAssim hereafter) was performed. An ocean-sea ice coupled version of CESM2 was used, with the Community Ice CodE (CICE5) as the sea ice component (Hunke et al., 2015). CICE5 is a dynamic-thermodynamic sea ice model that includes a subgrid-scale ice thickness distribution and uses the same tripolar grid as POP2. The simulation was initialized on 1 January 2000 using oceanic and sea ice initial conditions from year 30 of an existing ocean-sea ice coupled simulation obtained with a repeat-year forcing data set (Bryan & Bachman, 2015). It was then integrated for 17 years, covering the 2000-2016 period, forced with the JRA55-do data sets. The 84-member ensemble of precomputed perturbations used by the EnOI to approximate samples from the forecast error covariance was constructed by randomly drawing 7 days from each individual month over the 12-year 2005-2016 period, for a total of 12 × 7 = 84 members per month (January to December). As a result, the ensemble used by the EnOI varies seasonally but does not have any interannual variations. We do not interpolate in time through a month, and there is a discontinuous switch in the prior ensemble at each month boundary. The NoAssim simulation was also used to evaluate the long-term spatial average of the model MDT, which is needed to account for the difference between the arbitrary reference level used by the model and that of the observed CNES-CLS13 MDT. Finally, the NoAssim simulation was used to initialize our prototype global eddy-permitting/eddy-resolving POP2 ocean model reanalysis on 1 January 2005. The retrospective analysis (denoted as Assim hereafter) was run for 12 years, from January 2005 to December 2016.
It was configured and forced like NoAssim but used the EnOI system described above to ingest satellite altimetry and SST observations along with temperature and salinity in situ observations.
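The construction of the seasonally varying perturbation ensemble described above can be sketched as a date-sampling exercise. This is our own illustration: the paper does not say how days near month ends are treated, so we draw from days 1-28, and the anomaly step (subtracting the mean of the sampled states) appears only as a comment:

```python
import numpy as np

rng = np.random.default_rng(2005)
years = range(2005, 2017)   # the 12-year NoAssim period

# For each calendar month: 7 random days from each of the 12 years,
# giving 12 x 7 = 84 snapshot dates per month.
ensemble_dates = {}
for month in range(1, 13):
    dates = []
    for year in years:
        days = rng.choice(np.arange(1, 29), size=7, replace=False)
        dates += [(year, month, int(d)) for d in sorted(days)]
    ensemble_dates[month] = dates
    # The perturbations would then be the NoAssim model states on these
    # dates minus their mean, used as the static prior ensemble whenever
    # the assimilation date falls in this month.
```

Because each month's 84 dates are fixed once drawn, the prior ensemble repeats every year and switches discontinuously at month boundaries, exactly as described in the text.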

Evaluation of the High-Resolution Data Assimilation
In this section, a brief evaluation of the retrospective analysis from the global high-resolution data assimilation system for the 12-year period is presented, considering only a few basic fields. For this purpose, biases relative to available observations for the Assim analysis and the free-running NoAssim simulation introduced in the previous section are compared to each other. To allow such comparisons between Assim, NoAssim, and various standard gridded observational products, the simulated fields are interpolated from the 0.1° POP grid to the regular grid used by each observational product.
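Once model and observations share a grid, globally averaged rms biases of the kind quoted below are simple to compute. A plausible minimal sketch is shown here; the paper does not state whether its rms values are area weighted, so the cosine-latitude weighting (standard for regular latitude-longitude grids) is our assumption:

```python
import numpy as np

def rms_bias(model, obs, lat):
    """Area-weighted global RMS of (model - obs) on a regular lat-lon grid.

    model, obs : (nlat, nlon) fields, model already interpolated to the
                 observational grid; NaNs mark land or missing data
    lat        : (nlat,) latitudes in degrees
    """
    diff = model - obs
    valid = np.isfinite(diff)
    # Grid-cell area on a regular lat-lon grid scales with cos(latitude).
    w = np.cos(np.deg2rad(lat))[:, None] * np.ones_like(obs)
    d2 = np.where(valid, diff**2, 0.0)
    return np.sqrt((w * d2).sum() / (w * valid).sum())
```

Skill metrics such as the correlation of area-averaged time series (used for the heat content comparison below) follow the same masking and weighting pattern.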

Journal of Advances in Modeling Earth Systems
warm bias off the South African coast associated with the Benguela upwelling system. Not surprisingly, Assim eliminates or substantially reduces these biases. There is also a modest reduction in the large-scale cold bias particularly evident in the Pacific and Indian Basins in NoAssim. These improvements are reflected in the globally averaged root-mean-square (rms) bias of 0.3°C for Assim, a significant reduction from 0.45°C in NoAssim. Nevertheless, some biases in Assim, for example, along the Kuroshio Current and the Sub-Antarctic Front in the Southern Ocean, remain comparable to those of NoAssim. Consequently, there is only a minor improvement in the global mean SST in Assim (19.2°C) over NoAssim (19.1°C), both lower than the Argo observational estimate of 19.6°C.
The mean DT for the 2005-2016 period for Assim and NoAssim is presented in Figure 5. The figure also includes mean biases against the CMEMS gridded multimission DT. In general, NoAssim has large basin-scale negative differences, particularly in the Pacific Ocean, associated with cold biases in the deep ocean (not shown). There are also some large biases in the Southern Ocean and in the vicinity of western boundary currents. These differences are related to biases in the positions of these energetic structures, that is, the Gulf Stream/North Atlantic Current, the Kuroshio, and the Antarctic Circumpolar Current. The positive differences in the equatorial and tropical oceans are associated with the warm biases in the upper ocean (see Figure 6). Globally, the assimilation is able to reduce the differences with the observed DT, bringing the rms bias from 8.6 cm in NoAssim down to 7.8 cm in Assim. However, this is only a small reduction, and the assimilation actually deteriorates the DT in the Southern Ocean, with large negative differences in the South Indian Ocean and the Agulhas Current region. These energetic regions, in which high eddy activity drives a lower signal-to-noise ratio, might not be sufficiently sampled by Argo (see Figure 3), leading to relatively large errors, particularly in the deep ocean. Indeed, Assim has a large cold bias below 1,000-m depth (not shown) in the same regions where the DT displays a negative bias (Figure 5). As stated in section 3, no localization is applied in the vertical, allowing altimetry observations to impact the entire water column. This, combined with the sparsity of in situ observations in those regions, implies that the covariances from the high-resolution model used to extend the altimetry observations to the ocean interior temperature and salinity are still problematic.
Proper initialization of the upper-ocean heat content has been shown to improve prediction skill in the North Atlantic (e.g., Yeager et al., 2012, 2018). Therefore, it is crucial for any data assimilation product intended for initialization of such prediction simulations to have a faithful representation of the upper-ocean temperatures. Such an assessment for Assim is presented in Figure 6, showing the 0- to 250-m depth-averaged mean potential temperature distributions in comparison to those of NoAssim and the Roemmich and Gilson (2009) gridded Argo data set. NoAssim shows a ubiquitous warm bias, with the largest differences found in the eastern tropical basins. The Gulf Stream and Kuroshio also show warm biases along their northern fronts, consistent with the fact that even at an eddy-permitting resolution, the western boundary currents tend to overshoot and flow too far north along the continental slopes. As seen with SSTs, there is an associated large cold bias to the east of Newfoundland due to the overly zonal North Atlantic Current. With assimilation, all these biases are significantly reduced. Quantitatively, the rms error is reduced from 1°C in NoAssim to 0.6°C in Assim. This improvement is due to a large-scale cooling in Assim, with the Pacific Ocean showing some cold biases. The mean 0- to 250-m depth-averaged potential temperature in Assim is 15.4°C, matching the value from Argo. In contrast, NoAssim is warmer with a value of 16°C.
The time series of the annual-mean upper-ocean heat content (down to 250-m depth) are shown in Figure 7 for the subpolar North Atlantic (SPNA) region, defined as the area between 45-20°W and 50-60°N. The figure confirms the improvement achieved by the assimilation. Specifically, Assim has a mean heat content of 50.97 × 10²² J for the 2005-2016 period, in much better agreement with the Argo estimate of 51 × 10²² J than NoAssim, which significantly underestimates the heat content with a mean of 50.73 × 10²² J, reflecting the cold bias seen over the SPNA region (Figure 6). The variability of the heat content is very well captured by both Assim and NoAssim, with the assimilation marginally improving the Pearson correlation coefficient from 0.94 for NoAssim to 0.96 for Assim.
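The heat content figures above are volume integrals of ρ c_p θ over the 0-250 m layer and the SPNA area. A minimal sketch of such a computation, assuming constant reference density and heat capacity (the constants and names are illustrative, not those of the actual analysis):

```python
import numpy as np

RHO0 = 1026.0  # reference seawater density (kg m-3), assumed constant
CP = 3990.0    # specific heat of seawater (J kg-1 K-1), assumed constant

def upper_ocean_heat_content(theta, dz, cell_area):
    """Heat content (J) of the supplied layers, integrated over area.

    theta     : potential temperature (degC), shape (nz, ny, nx),
                restricted to the 0-250 m layers
    dz        : layer thicknesses (m), shape (nz,)
    cell_area : horizontal cell areas (m2), shape (ny, nx)
    """
    # Depth integral first (J m-2 per column), then the area integral (J).
    column = RHO0 * CP * np.tensordot(dz, theta, axes=(0, 0))  # shape (ny, nx)
    return float(np.sum(column * cell_area))
```

For scale, a single 1 m² column of 250-m depth at a uniform 1°C contributes roughly 10⁹ J, so basin totals of order 10²² J are consistent with the SPNA area.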
A reduction in the tropical Pacific mean state biases is also known to lead to an improved representation of the El Niño-Southern Oscillation (ENSO) variability that, in turn, can produce better forecast skill (e.g., Kim et al., 2017; Manganello & Huang, 2009; Richter et al., 2018). Figure 8 shows the mean upper-ocean potential temperature differences for Assim and NoAssim from the gridded Argo data set (Roemmich & Gilson, 2009) along the equatorial Pacific. In NoAssim, there is a substantial warm bias in excess of 2.4°C to the east of the dateline, associated with a very diffuse thermocline. In contrast, Assim shows a vastly improved mean upper-ocean thermocline structure, with its 20°C isotherm tracking the observations very closely. The bias magnitudes are also sharply reduced, with the largest biases down to around 0.6°C.
The above results clearly demonstrate improvements in SST and DT with Assim. These improvements are most likely due to the availability of altimetry and remote SST data sets from satellite microwave radiometers, which provide a global synoptic view of the ocean surface and a strong constraint on the Assim SST and DT. In contrast, SSS is poorly observed. The recent ESA Soil Moisture and Ocean Salinity (SMOS), NASA Aquarius/SAC-D, and Soil Moisture Active Passive (SMAP) satellite missions have made it possible for the first time to measure SSS from space and can bring a valuable additional constraint to control the model salinity. However, satellite salinity observations are still relatively new and only available for the most recent years. Moreover, they still contain large errors in coastal oceans and at high latitudes (e.g., Vinogradova et al., 2014). In Assim, the corrections to the near-surface salinity evaluated by the assimilation scheme therefore rely heavily on the multivariate covariances that relate DT and SST observations to salinity innovations. Figure 9 shows the global-mean model-minus-Argo difference and rms error profiles of potential temperature and salinity for the 2005-2016 period for Assim and NoAssim. Consistent with the results presented above, the assimilation is able to reduce the error in temperature at the surface and also at depth, particularly around 100-m depth. However, for salinity, while the assimilation has little effect at depth, it significantly degrades the solution in the upper 100 m or so by further freshening an already fresh-biased upper ocean in NoAssim. Because salinity is dynamically relevant, degrading the salinity state can lead to errors in the velocity field, as illustrated by Vialard et al. (2003).
The Atlantic meridional overturning circulation (AMOC), representing zonally integrated circulation, is thus also constrained by salinity (e.g., Huang et al., 2012). Because the current implementation of the EnOI is found to degrade upper-ocean salinities, we expect poor AMOC representation in our prototype reanalysis. Indeed, this is confirmed in Figure 10 that shows the AMOC time-mean cell distribution for Assim and NoAssim.
There are significant differences in the representation of the primary circulation pattern associated with the North Atlantic Deep Water (NADW) cell (positive contours), with a stronger and deeper NADW transport in Assim. Furthermore, the NADW cell shows multiple distinct local maxima at different latitudes in Assim, lacking the meridional coherency seen in NoAssim. The strength of the deep ocean counter circulation (negative contours), associated with the northward flow of the Antarctic Bottom Water (AABW) intruding from the Southern Hemisphere, is also much stronger in Assim. Overall, our results appear to be consistent with some of the AMOC features described by Karspeck et al. (2017) in their intercomparison of AMOC in ocean reanalysis products and confirm that the historical reconstruction of AMOC is very sensitive to the details of assimilation procedures.
More in-depth analysis of Assim is not very meaningful until we can address the salinity issue and subsequently improve the AMOC representation.

Summary and Discussion
The DART Manhattan version includes new software infrastructure that enables ensemble data assimilation with high-resolution models, including CESM. This new version uses passive-target one-sided MPI communication to enable large-state ensemble data assimilation by distributing state vector information across multiple processors on different MPI tasks, effectively relaxing the memory limitations inherent to the state-complete representation paradigm used in previous DART versions. To achieve an affordable data assimilation system for a high-resolution version of the CESM2 ocean component, an EnOI scheme has been implemented within the DART Manhattan version. The EnOI scheme uses a static (but seasonally varying) ensemble of precomputed perturbations to approximate samples from the forecast error covariance and utilizes a single model integration to estimate the forecast mean. As a result, the computational cost of the EnOI is much less than that of the EnKF typically implemented with DART, making the EnOI a practical alternative for applications where computational cost is a limiting factor, such as global high-resolution ocean reanalysis. We estimate the cost of running the global high-resolution retrospective analysis presented in this manuscript at about 600 K core hours per simulation year on the 5.34-petaflops Cheyenne supercomputer (an SGI ICE XA cluster with 145,152 Intel Xeon processor cores and 313 TB of total memory) at the NCAR-Wyoming Supercomputing Center. Had we used the full EnKF scheme, the cost per simulation year would have been on the order of tens of millions of core hours, depending on the size of the ensemble used by the EnKF to approximate the prior probability distribution. In its current implementation, the EnOI system assimilates satellite altimetry and SST observations along with in situ temperature and salinity observations.
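The EnOI analysis step summarized above can be written compactly: with the static perturbations stored as columns of a matrix A′, the forecast error covariance is approximated as Pf ≈ α A′A′ᵀ/(m−1), and the single forecast is updated with the standard Kalman gain. A minimal observation-space sketch, for illustration only (the names are ours, and DART in practice applies the update serially, observation by observation, with localization, rather than forming these matrices explicitly):

```python
import numpy as np

def enoi_update(xf, Ap, H, y, R, alpha=1.0):
    """One EnOI analysis step: xa = xf + K (y - H xf).

    xf    : forecast state (n,), from a single model integration
    Ap    : static ensemble perturbations (n, m), columns sum to zero
    H     : linear observation operator (p, n)
    y     : observations (p,)
    R     : observation-error covariance (p, p)
    alpha : tunable scaling of the static covariance
    """
    m = Ap.shape[1]
    S = H @ Ap                                  # perturbations in obs space (p, m)
    PfHT = alpha * (Ap @ S.T) / (m - 1)         # Pf H^T          (n, p)
    HPfHT = alpha * (S @ S.T) / (m - 1)         # H Pf H^T        (p, p)
    K = PfHT @ np.linalg.inv(HPfHT + R)         # Kalman gain     (n, p)
    return xf + K @ (y - H @ xf)                # analysis state  (n,)
```

Because the perturbations are precomputed, only the single forecast xf is advanced by the model between analysis times, which is the source of the cost advantage over the EnKF quoted above.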
The new data assimilation framework is used to produce a global high-resolution retrospective analysis for the 2005-2016 period with the CESM2 ocean component. The assimilation is shown to improve the time-mean ocean state estimate relative to an identically forced ocean model simulation where no observations are ingested. Most of the improvements occur in the upper ocean, where Argo and other in situ observations from the WOD13 are available. However, highly energetic regions, such as the western boundary currents and the Antarctic Circumpolar Current, where high eddy activity drives a lower signal-to-noise ratio, still show notable biases because these regions are likely insufficiently sampled by Argo and the other types of in situ observations available in the WOD13. Despite the significant recent increase in in situ observations from the Argo program, another undersampling-related issue is seen in the upper-ocean salinities. Specifically, near the surface, where salinity is mostly controlled by surface flux exchanges rather than a temperature-salinity relationship, the salinity corrections inferred by the EnOI scheme lead to a significant deterioration of salinity in the mixed layer. Indeed, assimilation further freshens the upper ocean, which is already too fresh in the simulation without data assimilation. Capturing the observed salinity state has always been a challenge for global ocean data assimilation systems because salinity data are sparse compared to temperature data (e.g., Chang et al., 2011). One potential explanation for the poor surface salinity in Assim is as follows. By sampling the forecast error covariance from a long, free-running model simulation, we likely tend to overestimate the forecast error. As a result, the assimilation scheme gives too much weight to the observations compared to the model forecast.
Since we have plenty of SST observations and only sparse SSS observations to constrain the posterior, the assimilation update tends to overfit the posterior to the observed SST and infers SSS innovations from unreliable multivariate covariances between salinity and temperature near the surface. If this is indeed the case, a potential way to improve the surface salinity in Assim would be to use the adaptive inflation algorithm proposed by El Gharamti (2018), which rectifies the ensemble variance using inflation or deflation as needed. We note that more operationally oriented data assimilation systems tend to ingest a salinity climatology to weakly constrain salinity (and temperature), limiting drifts and avoiding the kind of issues seen in our results (e.g., Lellouche et al., 2018).
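The idea behind such inflation/deflation schemes can be illustrated with a simple innovation-consistency check: for unbiased innovations d = y − Hx_f, statistical consistency requires E[d²] = σ_f² + σ_o², so a persistent mismatch implies the prior variance should be rescaled. The sketch below shows only this underlying diagnostic, not El Gharamti's (2018) actual algorithm, which performs a Bayesian update of a spatially varying inflation field:

```python
import numpy as np

def innovation_consistency_factor(innovations, sigma_f2, sigma_o2):
    """Variance scaling factor implied by innovation statistics.

    For unbiased innovations d = y - H x_f, consistency requires
    E[d^2] = lambda * sigma_f2 + sigma_o2.  A factor lambda > 1 calls for
    inflating the prior variance sigma_f2; lambda < 1 calls for deflating it.
    """
    d2 = np.mean(np.asarray(innovations, dtype=float) ** 2)
    # Clip at zero: observation error alone can exceed the innovation variance.
    return max((d2 - sigma_o2) / sigma_f2, 0.0)
```

In the scenario hypothesized above, a static covariance drawn from a free-running simulation overestimates σ_f², and such a diagnostic would return a factor below one, that is, a call for deflation.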
It is well known that univariate assimilation techniques can have a detrimental effect on the ocean state variables not directly constrained by the data (e.g., Ji et al., 2000; Troccoli et al., 2002). Multivariate assimilation methods, like the EnOI scheme used in this study, can, in theory, offer an answer to this problem, provided the covariances used to propagate information from observed state variables to unobserved state variables are well known. Our results, with Assim showing a negative DT bias collocated with a cold bias at depth, suggest that even at high resolution, the covariances diagnosed from our static but seasonally varying ensemble of model states are not accurate enough to project the satellite observations to depth and infer meaningful temperature and salinity increments below the thermocline in regions with strong mesoscale activity. This limitation is potentially due to the limited ensemble size used by the EnOI scheme, but the physical memory limitation on our current computer system does not permit the use of a larger ensemble. Indeed, the sample covariance can be suboptimal as a result of the limited ensemble size. One common strategy to remedy this sampling error issue when implementing an ensemble method in a high-dimensional geophysical application is covariance localization. However, the issue of vertical localization for vertically integrated quantities, such as DT, is not straightforward. A number of studies have related localization to the correlation between an observation and a given state variable (e.g., Anderson, 2012). Vertically integrated quantities are expected to have a meaningful correlation with state variables, such as temperature and salinity, over the whole column, but the correlation will be a function of the state variable, location, and depth. Therefore, it can be expected that the localization function should also vary with those variables. Lei et al. (2020) have recently proposed an adaptive localization approach to estimate an effective vertical localization for assimilating satellite radiance observations, which measure integrated quantities over an atmospheric column. This adaptive method is based on sample correlations between ensemble priors of observations and state variables, aiming to minimize sampling errors in the estimated sample correlations. It will be very interesting to implement this kind of adaptive localization approach in our EnOI scheme to see if we can improve the quality of our results. Finally, we note that another limitation could come from the way the static ensemble used by the EnOI to approximate samples from the forecast error covariance was constructed. Further testing will be required to assess whether different ways of sampling the model internal variability to parameterize the EnOI could improve the results.
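Localization in DART-style ensemble systems is typically implemented with the Gaspari and Cohn (1999) fifth-order compactly supported correlation function, which tapers sample covariances smoothly to zero at twice a chosen half-width; an adaptive vertical scheme in the spirit of Lei et al. (2020) would, in effect, select that half-width per observation type, state variable, and depth. A sketch of the standard taper:

```python
import numpy as np

def gaspari_cohn(dist, c):
    """Gaspari and Cohn (1999) fifth-order piecewise-rational taper.

    dist : separation distance(s), in the same units as c
    c    : half-width; the taper equals 1 at zero separation and 0 beyond 2c
    """
    z = np.atleast_1d(np.abs(np.asarray(dist, dtype=float)) / c)
    w = np.zeros_like(z)
    near = z <= 1.0
    mid = (z > 1.0) & (z < 2.0)
    zn = z[near]
    w[near] = (-0.25 * zn**5 + 0.5 * zn**4 + 0.625 * zn**3
               - (5.0 / 3.0) * zn**2 + 1.0)
    zm = z[mid]
    w[mid] = ((1.0 / 12.0) * zm**5 - 0.5 * zm**4 + 0.625 * zm**3
              + (5.0 / 3.0) * zm**2 - 5.0 * zm + 4.0 - (2.0 / 3.0) / zm)
    return w
```

Multiplying the sample covariance between an observation and each model level by this taper (with the vertical separation as dist) suppresses spurious long-range correlations; the open question raised above is how to choose the half-width for a vertically integrated observation such as DT, whose physical correlation genuinely spans the whole column.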
The EnOI method can be used to extend our reanalysis further back in time. Although the approach itself would not change in such an application, some details of the data assimilation, such as localization, would change to account for decreases in available observations. The lack of satellite altimetry prior to the 1990s and satellite SST prior to the 1980s will present significant challenges.
Our ultimate goal is to create a seamless Earth system prediction framework within CESM that enables initialization through data assimilation with DART. The experience gained from this initial effort with a high-resolution ocean model version will guide our efforts to improve the quality and capabilities of our system in the future.