Volume 125, Issue 9 e2020JC016428
Research Article

A New Approach for Estimating Salinity in the Southwest Atlantic and Its Application in a Data Assimilation Evaluation Experiment

G. S. Dorfschäfer (corresponding author)

Oceanographic Modeling and Observation Network (REMO), Center for Research in Geophysics and Geology, Federal University of Bahia (UFBA), Salvador, Brazil

Graduate Program in Geophysics, Federal University of Bahia (UFBA), Salvador, Brazil

C. A. S. Tanajura

Oceanographic Modeling and Observation Network (REMO), Center for Research in Geophysics and Geology, Federal University of Bahia (UFBA), Salvador, Brazil

Graduate Program in Geophysics, Federal University of Bahia (UFBA), Salvador, Brazil

Department of Earth and Environmental Physics, Physics Institute, Federal University of Bahia (UFBA), Salvador, Brazil

F. B. Costa

Oceanographic Modeling and Observation Network (REMO), Center for Research in Geophysics and Geology, Federal University of Bahia (UFBA), Salvador, Brazil

Graduate Program in Geophysics, Federal University of Bahia (UFBA), Salvador, Brazil

R. C. Santana

Oceanographic Modeling and Observation Network (REMO), Center for Research in Geophysics and Geology, Federal University of Bahia (UFBA), Salvador, Brazil

First published: 18 August 2020

Abstract

A new method for generating synthetic salinity (SS) profiles in the Southwest Atlantic was developed and applied in data assimilation experiments. The method relies on the smallest integrated values of root mean square deviation (RMSD) with respect to observations to infer salinity from climatological data and from regression on temperature (T) using a fifth-order polynomial function (P5). In the 14 delimited subregions, the averaged RMSD of P5 was 45% smaller than that of the interpolated climatological data. However, climatological salinity performed better in the uppermost layers, while P5 presented smaller errors at greater depths. Therefore, by joining the best that P5 and climatology may offer, a new hybrid approach was used to generate SS based on T from XBT profiles. The SS allows more T profiles to be employed in the Oceanographic Modeling and Observation Network (REMO) data assimilation system, called RODAS, with the Hybrid Coordinate Ocean Model (HYCOM). The use of SS estimates has the potential to improve model outputs in systems that require the T-S pair. Three integrations were performed: one run without assimilation (FREE); one assimilating sea surface temperature, Argo profilers, and sea level anomaly (RODAS); and one similar to RODAS but with added XBTs with SS (RODAS_XBT). The inclusion of XBT data in the HYCOM + RODAS system improved the position and magnitude of the Brazil Current (BC). It was shown that SS is feasible for producing ocean reanalyses and initial conditions for ocean forecast systems at very low computational cost.

Key Points

  • A new hybrid approach for estimating salinity from temperature profiles in the Southwest Atlantic was developed
  • This method performed better than regression on temperature data alone or climatological data alone
  • Assimilating XBTs with synthetic salinity into HYCOM+RODAS improved the system

1 Introduction

The thermohaline state of the ocean carries a great deal of information about its anisotropic structure and circulation. Therefore, temperature and salinity have been the most important observations collected around the globe. These variables can affect marine life and have been used for a variety of purposes, including improving numerical outputs when used in conjunction with data assimilation schemes. Temperature and salinity, together with pressure, can be used to express the local density of seawater, a key variable in ocean processes. Although the relationship between these two parameters is specific to each region (Emery & Dewar, 1982), they have been used jointly since the last century to characterize the oceans' water masses (Sverdrup et al., 1942).

According to Hansen and Thacker (1999), temperature measurements are cheaper and easier to obtain than salinity measurements. Temperature is considered one of the main variables of atmosphere-ocean interaction and is also the dominant variable for determining sound speed in the water column. Moreover, in low and middle latitudes, the density of seawater is largely dominated by temperature. Salinity measurements, in contrast, have always been more challenging, as they are indirect measurements based on seawater electrical conductivity. Thus, because the role of salinity in the oceans has been more difficult to understand, researchers presented more solutions for handling temperature in numerical models used in conjunction with data assimilation (Troccoli et al., 2002). This situation persisted until the launch of the Argo (Array for Real-time Geostrophic Oceanography) program (Roemmich, 2009). Since the early 2000s, this rich database has been providing key information to better understand the ocean climate and to improve all operational and nonoperational data assimilation systems (Oke et al., 2015; Riser et al., 2016; Tanajura et al., 2020).

Before the Argo period, expendable bathythermographs (XBTs) were the most widely used form of ocean sampling. In the mixed layer (ML), temperature and salinity variations are much more sensitive to seasonality, which is responsible for changes in precipitation rates and in the heat stored by the oceans. These fluctuations prevent good correlation between the two variables (Chen & Geng, 2018; Marrero-Díaz et al., 2001). Below the ML depth, in the thermocline and deeper zones, there are areas where temperature variations are very well correlated with salinity variations, and the temperature-salinity (T-S) relationship is almost linear. Based on this relation, it is possible to estimate synthetic salinity (SS) profiles to accompany the temperature profiles. These profiles can be inferred using climatological data sets, surface salinity, depth, temperature, latitude, longitude, or any other predictive parameters (Hansen & Thacker, 1999).

Many authors (Goes et al., 2018; Han et al., 2004; Hansen & Thacker, 1999; Korotenko, 2007; Thacker, 2006, 2007; Vossepoel et al., 1999) proposed methods for estimating salinity profiles. Stommel (1947) was the first to study the T-S relationship. He considered that salinity variability was due to vertical displacements of water. Emery (1975) obtained satisfactory results using the T-S relation to estimate salinity and then perform dynamic height calculations with XBT data. Emery and Wert (1976) found out that their dynamic height calculations diverged from observations mainly in areas close to the limits of water masses, that is, regions of intense thermohaline gradient.

Since Stommel (1947), the most widely used approaches for estimating salinity have been those that take climatological values into account. Emery and O'Brien (1978) suggested the use of mean salinity profiles, and they were able to reduce the root mean square deviations (RMSDs) relative to the conventional T-S method. Emery and Dewar (1982) alternated between the T-S relationship and vertical profiles of mean salinity, according to the lowest salinity RMSDs found, to calculate dynamic heights in the North Atlantic and North Pacific. Utilizing polynomial relationships, Marrero-Díaz et al. (2001) performed calculations of density, dynamic height, and geostrophic velocity using only XBT data. They found the best results below 150 m and warned about the presence of mesoscale structures such as eddies and meanders, which may modify the position of the ML depth. They concluded that their procedure was effective and feasible for obtaining optimal descriptions of local dynamics based only on temperature observations.

In a recent publication, Goes et al. (2018) estimated salinity from XBT temperature using multivariate linear regression in the Atlantic domain. Their predictors consisted of temperature, depth, squared temperature, and annual and semiannual harmonics representing seasonality. Comparing their algorithm with the one proposed by Thacker (2007), they showed that replacing some predictors with the seasonal information can improve the salinity estimates, mainly in the upper ocean down to 150 m, where seasonality plays an important role.

Focusing on the assimilation of XBT data, Hansen and Thacker (1999) elaborated an algorithm to estimate salinity by regressive methods taking into account parameters such as temperature, surface salinity, and latitude. This approach was sufficiently effective to produce reliable salinity estimates even in the presence of barrier layers, regions with a halocline within the thermal ML (de Boyer Montégut et al., 2007). In order to obtain a good temperature field when only temperature is assimilated, Troccoli et al. (2002) found out that it may be necessary to correct the salinity profile in a univariate optimal interpolation scheme. Their adjustment method used local T-S relation from the background state to correct salinity, so corrections were made after the calculation of the objective analyses. They showed that updating temperature and not modifying salinity led to the generation of unrealistic water masses, which corrupted the model state. Ricci et al. (2005) applied the salinity adjustment method developed by Troccoli and Haines (1999) and discussed how the T-S relation could be introduced as a multivariate constraint in a 3DVAR scheme for an ocean general circulation model. Even assimilating only T data, this approach allowed the simultaneous correction of T and S and also reduced the representation of a strong bias associated with artificial geostrophic currents in the Pacific Ocean. While many aspects of the ocean analysis have been improved, they found some degradation in near-surface salinity and very large vertical mixing coefficients, thus suggesting the need for improvement in their assimilation system.

In Brazil, the Oceanographic Modeling and Observation Network (REMO) has pioneered the study and implementation of an operational ocean forecast system. The REMO Ocean Data Assimilation System (RODAS) (Mignac et al., 2015; Tanajura et al., 2014) was developed, and a version runs operationally at the Brazilian Navy Hydrographic Center (CHM) to produce short-range forecasts. RODAS runs with the HYbrid Coordinate Ocean Model (HYCOM) and is based on a multivariate Ensemble Optimal Interpolation (EnOI) scheme (Evensen, 2003; Oke et al., 2005, 2008, 2015). It was built considering the specificities of HYCOM, following the recommendations of Xie and Zhu (2010). HYCOM is essentially formulated in terms of isopycnals, where only two of the three variables (density, temperature, and salinity) are independent. The problem of how to assimilate an observed T-S profile into HYCOM needs more attention than in nonisopycnic models. Most observed profiles of temperature and salinity are measured in z-level or p-level coordinates, and their assimilation into HYCOM becomes nontrivial because its vertical coordinate system is time and space dependent (Thacker & Esenkov, 2002; Wang et al., 2017; Xie & Zhu, 2010).

According to Xie and Zhu (2010), EnOI schemes can be divided into two kinds that use different innovation vectors. The first is called the straightforward scheme, in which the model variables are updated at the same time from temperature and salinity observations. This scheme is also able to construct salinity from temperature through the error covariances, which can be very advantageous for the assimilation of T from XBTs. In this scheme, the model variables are carried to the observational space by the observation operator (see Equation 1).

The other kind is based on the methodology of Thacker and Esenkov (2002) and is called the modified scheme, in which the observations are projected into the model space. As a first step, potential temperature and salinity profiles are used to compute the pseudo-observed model layer thickness (dpobs). The dpobs is then assimilated to correct the real model layer thickness along with the model velocity. After that, T (or S) is assimilated to update the model layer T (or S), followed by diagnosing S (or T) from the seawater equation of state below the ML. This final step aims to preserve the target densities of the HYCOM isopycnal layers.

The observation operators (H in Equation 1) used in the modified schemes are linear. In the straightforward schemes, however, the observation operator becomes complex and nonlinear. A nonlinear operator in a linear equation may negatively affect the corrections implied by the objective analyses (Mignac et al., 2015; Xie & Zhu, 2010). When comparing their results against climatology, TAO, and independent Argo data, Xie and Zhu (2010) showed that the modified schemes present significant improvement over the straightforward schemes. RODAS is based on a modified scheme. Thus, as a first step, the system needs the T-S pair simultaneously to compute the dpobs and then carries out the assimilation of T-S profiles.

Considering the extensive amount of XBT observations mostly collected in the pre-Argo era, and in order to investigate whether these data can still improve model outputs, the main goal of this work is to propose a new method for estimating salinity when only temperature data are available in Metarea V (36°S to 7°N, from 20°W westward to the Brazilian coast). To evaluate the impact of the developed method in a data assimilation framework, experiments were performed with the HYCOM + RODAS system using a version of the EnOI modified scheme. In one of the experiments, the temperature from XBTs with estimated salinity was assimilated over 1 year.

The paper is organized as follows: Section 2 describes the treatment and preparation of the observed data, the SS generation, and the validation method. Section 3 presents the model configuration, the assimilation scheme, and the results and discussion of the data assimilation experiments. Section 4 presents some conclusions and plans for future work.

2 Estimating Salinity in Metarea V

2.1 Data Selection and Quality Control

To estimate salinity associated with XBTs in Metarea V, two approaches were initially considered: one consisting only of interpolated climatological salinity and another based on polynomial regression similar to that of Marrero-Díaz et al. (2001). The monthly climatological data set was acquired from the World Ocean Atlas 2013 (WOA13) on a 0.25° × 0.25° grid (available at https://www.nodc.noaa.gov/cgi-bin/OC5/woa13/woa13.pl?parameter=t). To obtain analytic relationships between salinity and temperature, we searched for conductivity-temperature-depth (CTD) casts and profiling float (PFL) data covering the whole study area, which included different programs, seasons, and years. All casts found were considered. They were obtained from the World Ocean Database 2013 (WOD13) of the U.S. National Oceanographic Data Center (NODC) (available at https://www.nodc.noaa.gov/OC5/WOD13/data13geo.html).

The extent of the area and the large amount of data allowed us to work with zones close or equal to squares of 10° of latitude and longitude. This choice was made mainly because this is the size of the areas defined in WOD. Enough data were found to cover much larger areas, but a polynomial regression fitted over a very large area would not be representative. Therefore, to obtain relevant estimates over such a large domain, the study region was divided into 14 subregions (Figure 1a).

Figure 1. (a) Delimitation of Metarea V in 14 subregions and spatial distributions of the 14,433 profiles that composed the elaboration set (2/3 of the total, red dots) and the 7,210 profiles of the verification set (1/3 of the total, blue dots). (b) Spatial distributions of the 2,329 Argo profilers (black dots) and the 701 XBTs (small red diamonds) utilized in the assimilation experiments. The transect around 27–42°W, 19–23°S is called AX97 by NOAA. The gray line marks the 200 m isobath. Both panels show the HYCOM 1/12° domain.

After all CTD and PFL data available in the domain were acquired, it was found that most zones had more temperature profiles than salinity profiles. Moreover, single CTD casts often contained isolated T points without a corresponding S. For the temperature (salinity), 7,159 (7,134) CTD profiles were found with 7,907,994 (7,836,871) data points and 34,193 (32,002) PFLs with 5,497,167 (5,290,973) data points. Thus, the total number of stations found for the entire study region was 41,352 (39,136) for the temperature (salinity) with 13,405,121 (13,127,844) data points, which resulted in 277,277 T points without the corresponding S. Initially, all stations were going to be used to produce SS, but preliminary inspection of T-S diagrams and vertical profiles (graphs not shown) revealed many oscillations and discrepancies that did not correspond to the T and S values expected for each region. Thus, a strict quality control (QC) procedure was devised to prepare the data for the subsequent steps.

When estimating S as a function of T by regression in seawater, the predictor variable (T) and the response variable (S) must be available as a pair at each depth level. For this reason, the casts with more T measurements than S measurements were removed. Next, the stations that did not reach at least 100 m depth were eliminated to avoid the influence of neritic waters (Caspel et al., 2010; Siedler & Stramma, 1983). The T-S fluctuations due to the seasonality of the ML alone do not favor regression estimates, and the profiles that did not reach 100 m were almost always incomplete, gappy, or inconsistent.

Profiles with 15 data points or fewer were excluded, along with duplicated profiles and those that began at depths greater than 100 m. Duplicated profiles may bias the estimates. Incomplete profiles, that is, those with gaps anywhere in their vertical structure, do not contribute information about the ML and can introduce errors. Therefore, it was better to avoid them completely (Thacker, 2006), and the stations with gaps were removed. Profiles that showed salinity outside the range [34, 37.5] g/kg were also excluded (Caspel et al., 2010; Pearce, 1981).
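The profile-level checks described above can be sketched as a single predicate. This is only an illustration under stated assumptions: the function name `passes_profile_qc` and its parameters are hypothetical, a missing value is assumed to be stored as NaN, and the duplicate and gap checks are left out because the text does not say how they were detected.

```python
import math

def passes_profile_qc(depth, temp, sal, min_points=16, min_reach=100.0,
                      max_start=100.0, s_range=(34.0, 37.5)):
    """Return True if one cast passes the profile-level checks in the text."""
    # Every T value needs a matching S value at the same depth level.
    if not (len(depth) == len(temp) == len(sal)):
        return False
    if any(math.isnan(v) for v in list(temp) + list(sal)):
        return False
    # Profiles with 15 data points or fewer are rejected.
    if len(depth) < min_points:
        return False
    # The cast must reach at least 100 m and must not begin below 100 m.
    if max(depth) < min_reach or min(depth) > max_start:
        return False
    # Any salinity outside [34, 37.5] g/kg rejects the whole profile.
    if min(sal) < s_range[0] or max(sal) > s_range[1]:
        return False
    return True
```

Applied to a list of casts, the predicate would simply filter out the rejected ones before any regression is attempted.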

The aim of this study, besides generating synthetic salinities for Metarea V, was to extract relevant information from XBT data. Because these probes rarely reach depths greater than 700 m (Hansen & Thacker, 1999; Marrero-Díaz et al., 2001, 2006), all profiles were truncated at a limit depth of 750 m (Machín et al., 2010).

To eliminate remaining outliers, any salinity outside the range M(z) ± 2 sd(z), where M(z) is the mean and sd(z) is the standard deviation at depth level z, was excluded from the analysis (Emery & Dewar, 1982; Pearce, 1981). A 5-point moving average was also applied. Profiles that showed density inversions were considered unfit for use in the estimates and were eliminated. The remaining data were interpolated to 5 m depth intervals. After the application of the QC, the total number of T-S profiles ready to be used was 21,643 with 3,150,721 data points, in which all the T profiles had a corresponding S.
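A minimal sketch of the point-level steps follows, assuming NumPy is available. The function names, the use of the population standard deviation, and the edge handling of the moving average are assumptions of this illustration, not details given in the text. The density-inversion check is omitted because it needs an equation of state.

```python
import numpy as np

def two_sigma_mask(sal_at_level):
    """True where salinity at one depth level lies within M(z) +/- 2 sd(z)."""
    s = np.asarray(sal_at_level, dtype=float)
    # Population standard deviation (ddof=0) is an assumption of this sketch.
    return np.abs(s - s.mean()) <= 2.0 * s.std()

def smooth_and_regrid(depth, values, max_depth=750.0, step=5.0, window=5):
    """Truncate at max_depth, smooth, and interpolate to a regular depth grid.

    `depth` is assumed to be sorted in increasing order.
    """
    depth = np.asarray(depth, dtype=float)
    values = np.asarray(values, dtype=float)
    keep = depth <= max_depth
    depth, values = depth[keep], values[keep]
    # Repeat the end values so the moving average keeps the profile length.
    half = window // 2
    padded = np.pad(values, half, mode="edge")
    smoothed = np.convolve(padded, np.ones(window) / window, mode="valid")
    # Regular grid starting at the first multiple of `step` inside the profile.
    start = np.ceil(depth[0] / step) * step
    grid = np.arange(start, depth[-1] + 1e-9, step)
    return grid, np.interp(grid, depth, smoothed)
```

The order of operations (truncate, smooth, regrid) follows the order in which the text lists them. Whether the original processing applied them in exactly this sequence is not stated.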

2.2 Description of SS Method

Following the methodology proposed by Hansen and Thacker (1999), the remaining set was randomly divided into two groups, the elaboration set (ES) and the verification set (VS) (Figure 1a). The ES consisted of 2/3 of the stations that passed all QC steps and was used to fit polynomial regression models by the least squares method, that is, to find the coefficients of functions of degree 3 (P3), 5 (P5), and 7 (P7). The VS was formed by the remaining 1/3 of the quality-controlled profiles, which did not participate in any estimation. They were used only for comparison and subsequent verification of the skill of the method.
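The split and the fit can be illustrated as follows. This is a sketch under assumptions: the helper names, the random seed, and the synthetic T-S data in the usage lines are all hypothetical, and NumPy's `polyfit` is used as one possible least squares solver, since the text does not name the software used.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def split_profiles(n_profiles, frac_es=2 / 3, seed=0):
    """Randomly split profile indices into elaboration and verification sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_profiles)
    n_es = int(round(frac_es * n_profiles))
    return order[:n_es], order[n_es:]

def fit_ts_polynomial(temp, sal, degree=5):
    """Least squares coefficients b0..b_degree of S as a polynomial in T."""
    return P.polyfit(np.asarray(temp, dtype=float),
                     np.asarray(sal, dtype=float), degree)

# Usage with made-up data: fit P3, P5, and P7 to the same T-S pairs.
t = np.linspace(3.0, 28.0, 200)
s = 34.0 + 0.1 * t - 0.002 * t**2
fits = {deg: fit_ts_polynomial(t, s, deg) for deg in (3, 5, 7)}
```

Coefficients are returned from the lowest to the highest power, which matches the b0 to b5 ordering used for P5.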

The three polynomials showed very similar averaged results, but, in general, the higher degrees gave the best estimates. The quality improvement was much greater from P3 to P5 than from P5 to P7 (graphs not shown). This behavior was expected, since the T-S diagrams of some subregions are relatively complex and are poorly represented by lower-order functions. Higher degrees fit the data used to generate the model more closely but may also generate unrealistic oscillations between the analyzed points. Higher orders also carry higher computational costs. Therefore, considering these aspects, P5 was chosen to represent the polynomial estimates. Its formula can be represented as
$$SS_z = b_0 + b_1 T_z + b_2 T_z^2 + b_3 T_z^3 + b_4 T_z^4 + b_5 T_z^5 \qquad (1)$$
where $z$ is depth, $SS_z$ is the SS at depth $z$, $T_z$ is temperature at depth $z$, and $b_i$, $i = 0, \dots, 5$, are the P5 coefficients.

Table 1 summarizes the P5 coefficients, the RMSDs with respect to the VS, the adjusted coefficient of determination (R2a), and the number of observations used in the calculations. The coefficients can be used with any temperature data to generate reliable predictions from the surface down to 750 m.

Table 1. Coefficients of the Fifth-Degree Polynomials for Each Subregion
Zone   b0      b1         b2           b3·10^1       b4·10^3      b5·10^4       RMSD    R2a     Obs
5,002  34.100  0.116827   −0.01644926  0.019875805   −0.08075008  0.010277771   0.0772  0.9828  210,865
5,003  33.905  0.186080   −0.02497776  0.023253448   −0.07867560  0.008109808   0.0987  0.9871  87,944
5,102  34.478  −0.103489  0.02828673   −0.022883192  0.10471070   −0.018048038  0.0635  0.9954  214,980
5,103  34.699  −0.212296  0.04576705   −0.035208972  0.14284778   −0.022067078  0.0634  0.9957  108,268
5,202  34.550  −0.155208  0.03370932   −0.023088047  0.08907071   −0.013847592  0.0885  0.9857  225,391
5,203  34.686  −0.240545  0.05028849   −0.036985832  0.14184242   −0.021217176  0.0924  0.9859  207,377
5,204  34.167  −0.035585  0.01763109   −0.011820725  0.05186683   −0.009389615  0.0884  0.9859  89,548
5,302  33.429  0.336719   −0.05847579  0.062187013   −0.27849958  0.043279519   0.0662  0.9804  157,702
5,303  32.820  0.617535   −0.10653269  0.099934146   −0.41260644  0.060691474   0.0687  0.9819  208,917
5,304  32.463  0.743981   −0.12136489  0.104851347   −0.40101753  0.055229803   0.0823  0.9825  145,371
5,305  32.868  0.544830   −0.08310102  0.070816059   −0.26159567  0.034204566   0.1497  0.9545  11,608
7,002  34.901  −0.223248  0.03990469   −0.022509449  0.06253219   −0.007780102  0.0786  0.9656  227,973
7,003  34.905  −0.212459  0.03678389   −0.020062488  0.05603383   −0.007172574  0.0769  0.9739  144,702
7,004  34.514  −0.017337  0.00081582   0.009215878   −0.04711367  0.005890336   0.0695  0.9876  60,655
  • Note. The RMSD (g/kg), the adjusted coefficient of determination (R2a), and the number of observations (Obs) used in the calculations are also presented.
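As an illustration of how a Table 1 row might be applied, the sketch below evaluates the fifth-degree polynomial for Zone 5,002. It assumes that the last three coefficient columns list b3, b4, and b5 multiplied by 10^1, 10^3, and 10^4, which is a reading of the column headers rather than something the text states explicitly. The names `ROW_5002`, `SCALES`, `unscale`, and `synthetic_salinity` are hypothetical.

```python
# Table 1 row for Zone 5,002, copied as printed.
ROW_5002 = (34.100, 0.116827, -0.01644926, 0.019875805, -0.08075008, 0.010277771)
# Assumed reading of the header: columns hold b3*10^1, b4*10^3, b5*10^4.
SCALES = (1.0, 1.0, 1.0, 1e1, 1e3, 1e4)

def unscale(row, scales=SCALES):
    """Recover b0..b5 from the printed (scaled) table values."""
    return [value / scale for value, scale in zip(row, scales)]

def synthetic_salinity(temp_c, coeffs):
    """Evaluate SS = b0 + b1*T + ... + b5*T^5 with Horner's rule."""
    ss = 0.0
    for b in reversed(coeffs):
        ss = ss * temp_c + b
    return ss

coeffs_5002 = unscale(ROW_5002)
for t in (5.0, 15.0, 25.0):
    print(t, round(synthetic_salinity(t, coeffs_5002), 3))
```

Under this reading, temperatures of 5°C, 15°C, and 25°C give salinities between roughly 34.5 and 36.3 g/kg, inside the [34, 37.5] g/kg range used in the QC, which is at least consistent with the assumed scaling.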

The coefficients obtained here have some advantages compared with other works: (i) they cover a large part of the Southwest Atlantic, (ii) they have a very low implementation cost, (iii) they are atemporal, and (iv) they can be applied in any kind of scheme and for different purposes, providing the scientific community with new relevant (synthetic) data.

2.3 Validation Method

The estimates were validated using the profiles of the VS. After obtaining the fifth-degree polynomial coefficients, we used VS temperature data to calculate SS for each depth and each subregion according to Equation 1. The true salinity of each profile was withheld to compute error statistics and the accuracy of the SS estimates. The RMSD calculations were performed according to
$$\mathrm{RMSD}_z = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(S_{z,i} - SS_{z,i}\right)^2} \qquad (2)$$
Here, $S_{z,i}$ is the true salinity of the VS for each sample $i$ at each depth level $z$, $N$ is the total number of samples at depth $z$, and $SS_{z,i}$ is the synthetic salinity.
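The RMSD at each depth level can be computed as in the short sketch below. It assumes that all profiles share the same depth levels, for example after interpolation to 5 m intervals. The function name and the toy numbers are hypothetical.

```python
import math

def rmsd_by_depth(true_sal, synth_sal):
    """RMSD between true and synthetic salinity at each depth level.

    Both arguments are lists of profiles; every profile is a list of
    salinities given on the same depth levels.
    """
    n_levels = len(true_sal[0])
    out = []
    for z in range(n_levels):
        squared = [(s[z] - ss[z]) ** 2 for s, ss in zip(true_sal, synth_sal)]
        out.append(math.sqrt(sum(squared) / len(squared)))
    return out

# Two toy profiles with two depth levels each.
true_profiles = [[35.0, 35.0], [35.0, 36.0]]
synth_profiles = [[35.1, 35.0], [34.9, 35.0]]
rmsd = rmsd_by_depth(true_profiles, synth_profiles)
```

In practice the number of samples can differ from level to level when profiles have different lengths, so a real implementation would count the valid samples at each depth separately.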

Figure 2 shows the T-S diagram and the best fit curve generated by each P5 for the VS of each zone. In the 14 subregions, the presence of three water masses—Tropical Water (TW), South Atlantic Central Water (SACW), and (beginning of) Antarctic Intermediate Water (AAIW)—is perceptible.

Figure 2. T-S diagram (red) and best fit polynomial curve (green) for each of the 14 subregions from the surface to a depth of 750 m using the verification set data.

The ML, with depth varying from 80 to 150 m depending on latitude, is composed mainly of TW and exhibits wide variability associated with seasonality. Below this layer, with T lower than 20°C, we find SACW with a very precise and monotonic T-S relationship (Stramma & England, 1999). This almost linear relation could favor the representation of SACW salinities by the fifth-degree polynomials. In the lower portion of the T-S diagrams, a salinity minimum marks the presence of the AAIW. It flows northward from approximately 27°S and is transported by the Intermediate Western Boundary Current (IWBC) (Boebel et al., 1999; Legeais et al., 2013). The low surface salinity of Subregions 5,002, 5,003, 7,002, 7,003, and 7,004 indicates that these regions were under the influence of the Intertropical Convergence Zone (ITCZ), which has large seasonal-to-interannual precipitation variability (Byrne et al., 2018).

Climatological salinity profiles have been widely used for initialization of numerical models since the last decade (Thacker & Esenkov, 2002). Here, monthly climatological salinity from WOA13 was spatially interpolated to the data in the VS. The results were compared with the SS derived from the best fit curve. Figure 3 shows the RMSD of both estimates with respect to the true salinity of the observations. Although seasonal estimates could bring some benefits to the salinity predictions, the present work found insignificant improvements, and they were not considered in the discussions.

Figure 3. Vertical distribution of the RMSD of salinity estimates obtained by P5 (continuous red) and WOA13 (dashed blue) with respect to verification set data from each subregion. The All panel represents the average for all 14 subregions. The vertically averaged RMSD value for each profile is also displayed on the right side of the profiles.

As in the T-S diagrams of Figure 2, the similarity among the subregion curves arranged in the same zonal strip (5,002 and 5,003; 5,102 and 5,103; 5,202, 5,203, and 5,204; 5,302, 5,303, 5,304, and 5,305; and 7,002, 7,003, and 7,004) is remarkable. This fact suggests that it would be possible to apply a single polynomial for all subregions of same latitude interval, reducing complexity without entailing significant losses.

Below 150–200 m depth, both estimates present very small errors. P5 is stable and outperforms the WOA estimates over most of this depth range in all subregions. However, WOA errors become progressively smaller with increasing depth. In contrast, in some subregions (e.g., 5,202, 5,204, and 5,305), the RMSD values of P5 tend to increase by a small amount very close to the limit depth of 750 m. As a result, the WOA RMSD curves overlap the P5 curves (e.g., 5,102 and 5,103), sometimes with slightly smaller errors (e.g., 5,202 and 7,002). This becomes more evident in the average RMSD profile of all subregions. At 750 m depth, there is influence of the AAIW (Marrero-Díaz et al., 2001). This water mass presents positive increments in S with almost constant T with increasing depth, culminating in the North Atlantic Deep Water. The analytical relationship between T and S can only be described successfully in places where temperature varies along with salinity. In this portion of the water column, this condition is violated, and the best fit curve tends to present larger errors with increasing depth. Marrero-Díaz et al. (2001) and Caspel et al. (2010) found similar results for the Atlantic Ocean. Also, below 750 m the variations of T and S are usually very small, making the climatological set very suitable to represent these regions.

Both P5 and WOA had difficulty representing the portion of the ocean between the surface and the ML depth near 150 m. This part of the water column has the largest errors of both estimates in all subregions. At the surface, where the variability is high and salinity does not correlate well with temperature, the regression models essentially estimate the salinity by the average of the ES, and the errors reflect the variability of the data around this estimate (Thacker, 2006). Since the mean does not represent these highly variable zones very well, the residuals there tend to be larger than elsewhere. Hansen and Thacker (1999) had already pointed to this possibility, since this zone presents great seasonal variability in response to interactions with the atmosphere. The small amount of data available at the surface and the presence of the seasonal ML in some regions can also amplify the errors. However, for this same depth interval, it is worth noting that the WOA estimates showed lower RMSDs in at least 11 distinct zones. The three zones that contradict this pattern are those with the largest hydrodynamic and thermohaline variability: Zones 7,004, 5,204, and 5,305.

Located entirely in the Northern Hemisphere, Subregion 7,004 has peculiar oceanographic characteristics. The North Brazil Current (NBC) is present in the region and exports waters from the south to the north as part of the Atlantic meridional overturning circulation. As it advances along the slope, the NBC can retroflect, feeding the currents that flow to the east, or can continue its course northwestward. This retroflection is more intense in the austral summer and fall months and may disappear in the spring (Johns et al., 1998). In this highly energetic region, eddies form and are released due to the changes in the direction of the current (Jochum et al., 2004). Thus, the NBC can influence the Amazon River plume pattern, an important process in the region. The plume reduces the gradients increasing or decreasing buoyancy and, consequently, the barotropic currents. The spread of the plume alters the vertical stratification of the water column in much of the equatorial zone, resulting in the formation of a halocline that can reach up to 50 m depth and can change the ocean-to-atmosphere balance (Hu et al., 2004). In addition, the ITCZ oscillates seasonally, moving south in the austral summer due to the marked presence of the northeast trade winds. This strong convection region produces zones with great cloud cover and high precipitation rates over the equatorial zone.

Subregion 5,204 is characterized by strong mesoscale activity in the form of meanders and eddies (Campos et al., 1995; Silveira et al., 2006). According to Campos et al. (1995), the BC flows southward bordering the slope with practically constant depth. In the region of Cape São Tomé and Cape Frio, with the abrupt change in the direction of the coastline, by inertia, the BC continues to move to deeper regions. By conservation of potential vorticity, the increase in depth generates stretching of the column and increase in negative relative vorticity, leading to cyclonic circulation and flux toward the shallower zones. When reaching such zones, the decrease in depth causes compression of fluid column and positive relative vorticity, with anticyclonic rotation. This change in the vorticity causes it to return to greater depths. Thereafter, the flow follows as a topographic Rossby wave, which ends up resulting in a series of meanders and mesoscale eddies. This fact and the baroclinic instability generated by the shear between BC and IWBC have great impact in local dynamics.
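The vorticity argument in the paragraph above can be written compactly. As a sketch, assume a barotropic, frictionless fluid column of thickness $H$, with $\zeta$ the relative vorticity and $f$ the Coriolis parameter. These symbols are notation introduced here, not taken from the original text. Conservation of potential vorticity between two positions 1 and 2 along the flow then reads:

```latex
\frac{D}{Dt}\left(\frac{\zeta + f}{H}\right) = 0
\quad\Longrightarrow\quad
\frac{\zeta_1 + f}{H_1} = \frac{\zeta_2 + f}{H_2}
```

The second form further assumes that $f$ changes little over the excursion. In the Southern Hemisphere $f < 0$, so starting from $\zeta_1 \approx 0$, stretching of the column ($H_2 > H_1$) gives $\zeta_2 = f\,(H_2/H_1 - 1) < 0$, which is cyclonic there, while compression ($H_2 < H_1$) gives $\zeta_2 > 0$, the anticyclonic tendency described in the text.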

The recurrent thermohaline structure of the Brazil-Malvinas Confluence near Zone 5,305 is marked by a complicated matrix of water masses and intense temperature and salinity gradients. At the surface alone, values can range from 7°C to 18°C and from 33.6 to 36.0 psu. On the western boundary, along the front formed in the confluence zone, there is an intrusion of very low salinity waters originating from the plumes of the La Plata River and the Patos Lagoon, which are advected to the north by the Malvinas Current. This band of coastal water presents extremely low salinity and can significantly suppress ML thickness, especially in winter (Gordon, 1989). The absence of an ML may have helped the P5 estimates, which presented a vertically averaged RMSD of 0.12 g/kg, only 29% of the RMSD of the WOA estimates (0.41 g/kg).

The RMSD profiles of the three aforementioned zones (7,004, 5,204, and 5,305) demonstrate the feasibility of the polynomial estimates when compared to climatology. The synthetic estimates tend to perform better in places where the local dynamics are complicated and the thermohaline structure of the water column presents high spatiotemporal variability. Indeed, for estimating salinity, the best fit curves proved more suitable than climatology, since the vertically averaged RMSD of P5 (0.06 g/kg; All panel) was 45% smaller than the averaged RMSD of WOA (0.11 g/kg).

2.4 The Hybrid Approach

In Figure 3, the All profile does not represent each specific subregion very well, as the discrepancy with respect to the salinity estimates of Zone 5,305 is very high. The averaged curves of the All profile have similar magnitude, even though the WOA errors are lower in the upper layers in at least 11 subregions.

To investigate the total contribution of the errors along the water column, the RMSD curves of Figure 3 were integrated from the surface to 750 m depth at regular intervals of 5 m (Figure 4). Although the vertically averaged errors of the integrated P5 profiles are always smaller than those of the WOA estimates, as previously stated, the climatological curves in most of the regions presented smaller errors down to a certain depth. The depth at which P5 starts to present smaller integrated RMSDs than climatology was denominated the permutation point (PP). This is also shown in Figure 4.
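The search for the PP can be sketched as follows. This is a minimal illustration rather than the procedure actually coded for this work: the trapezoidal integration, the rule used to pick the crossing depth, and the function names `integrated_rmsd` and `permutation_point` are all assumptions introduced here.

```python
import numpy as np


def integrated_rmsd(rmsd, dz=5.0):
    """Cumulative trapezoidal integral of an RMSD profile from the surface down."""
    rmsd = np.asarray(rmsd, dtype=float)
    steps = 0.5 * (rmsd[1:] + rmsd[:-1]) * dz
    return np.concatenate(([0.0], np.cumsum(steps)))


def permutation_point(depth, rmsd_p5, rmsd_woa):
    """Shallowest depth below which the integrated P5 RMSD stays smaller than WOA.

    Returns 0.0 when P5 is better over the whole column, and None when P5
    never becomes better within the sampled depths (assumed convention).
    """
    depth = np.asarray(depth, dtype=float)
    dz = depth[1] - depth[0]  # regular spacing assumed (5 m in the text)
    int_p5 = integrated_rmsd(rmsd_p5, dz)
    int_woa = integrated_rmsd(rmsd_woa, dz)
    # Levels (excluding the surface, where both integrals are zero)
    # at which climatology is still at least as good as P5.
    worse = np.nonzero(int_p5[1:] >= int_woa[1:])[0] + 1
    if worse.size == 0:
        return 0.0
    last = worse[-1]
    if last == depth.size - 1:
        return None
    return float(depth[last + 1])


# Idealized profiles: P5 is worse in the top 100 m and better below.
depth = np.arange(0.0, 751.0, 5.0)
rmsd_woa = np.full(depth.shape, 0.10)
rmsd_p5 = np.where(depth < 100.0, 0.20, 0.05)
pp = permutation_point(depth, rmsd_p5, rmsd_woa)
```

Because the curves are integrated, the PP in this toy case lies well below the depth at which the two RMSD profiles themselves cross: the error accumulated by P5 near the surface has to be offset by its smaller error at depth before the integrated curves intersect.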

Vertically integrated RMSD of S from 0 to 750 m considering the profiles shown in Figure 3. Values for P5 are shown in solid red, while values for WOA are represented by dashed blue. All indicates the averaged values for all subregions, and PP is the permutation point for each subregion.

The PP indicates the depth at which P5 should replace climatology to obtain the best SS estimates. The PP was set to zero in zones where P5 presented smaller integrated RMSDs for the whole water column. This occurred in zones of high variability where climatology did not provide an adequate representation. The complementary nature of the estimates, with WOA being better in the first meters and P5 showing better responses from below the ML downward, suggests that their joint use could be advantageous. Thus, a hybrid approach was proposed in which climatology is used in the top layers down to the PP depth and P5 is used below it. This approach was used to estimate the salinity that accompanies the T from XBTs in the EnOI scheme of the HYCOM + RODAS system.
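The hybrid rule can be illustrated with a short sketch. The function names, the strict inequality at the PP depth, and the coefficient values used in the example are hypothetical; in particular, the coefficients are not those of Table 1 and serve only to exercise the code.

```python
import numpy as np


def p5_salinity(temperature, coeffs):
    """Evaluate a fifth-degree polynomial S(T); coeffs are ordered from T^5 to T^0."""
    return np.polyval(coeffs, np.asarray(temperature, dtype=float))


def hybrid_salinity(depth, temperature, s_clim, coeffs, pp):
    """Climatological salinity above the permutation point, P5 salinity from it down.

    With pp = 0 the whole profile comes from P5, matching the convention
    that PP is set to zero where P5 is better over the whole column.
    """
    depth = np.asarray(depth, dtype=float)
    s_p5 = p5_salinity(temperature, coeffs)
    return np.where(depth < pp, np.asarray(s_clim, dtype=float), s_p5)


# Illustrative values only (not taken from the paper's tables).
depth = np.array([0.0, 50.0, 145.0, 200.0])
temperature = np.array([25.0, 20.0, 15.0, 10.0])
s_clim = np.array([37.0, 36.8, 36.0, 35.0])
coeffs = [0.0, 0.0, 0.0, 0.0, 0.1, 34.0]  # hypothetical: S = 0.1 T + 34
ss = hybrid_salinity(depth, temperature, s_clim, coeffs, pp=145.0)
```

In an application, `s_clim` would be the climatological salinity interpolated to the XBT position and depths, and `coeffs` the subregion-specific coefficients.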

Figure 5 shows the vertical profiles of the salinity estimated only by P5 (left) and by P5 plus WOA (right) for Region 5,002. It is clear that the hybrid approach obtains the best salinity estimates, representing very well the first 145 m with climatology and below with P5. The use of climatology up to PP allows the representation of salinities above 36.5 g/kg, something previously impossible for P5 alone. When compared to WOA, the hybrid approach is able to capture greater variability and to present smaller errors with increasing depth. The method developed encompasses the best that each estimate has to offer and has already been used successfully by Santana et al. (2020).

Observed salinity (black), estimated salinity by P5 (left, red) and hybrid method (right). In right panel, WOA estimated the salinity up to 145 m (blue), and below that, salinity was estimated through P5 (red).

3 Model Configuration and Data Assimilation

In order to investigate the effectiveness of the developed SS in a preoperational context and to further investigate whether the assimilation of XBTs would bring a realistic impact to RODAS, data assimilation experiments were performed using HYCOM along with a modified version of the EnOI scheme (Mignac et al., 2015; Oke et al., 2008, 2015; Tanajura et al., 2014, 2020).

3.1 Model Initial Settings and Forcings

HYCOM's hybrid vertical coordinate scheme is well described in Bleck (2002). More details about HYCOM general settings can also be found in Chassignet et al. (2003) and Halliwell (2004). The model configuration used here has approximately 0.08° of horizontal resolution (HYCOM 1/12), and it was nested into HYCOM configured with 0.25° of horizontal resolution, both with 21 vertical layers. The lower-resolution configuration covers the region from 78°S to 55°N, and from 100°W to 20°E. This larger-scale grid was employed in a recent Atlantic study made by Tanajura et al. (2020) and Mignac et al. (2015). The HYCOM 1/12 presents 601 by 733 grid points in the zonal and meridional directions, respectively. The physical domain extends from 45°S to 10°N and from 18°W until the South American coast. The top eight layers have much lower target density values to ensure their treatment as z coordinates and, consequently, a good resolution in the ML. The bathymetry was interpolated from ETOPO 2.

The model was forced every 6 hr with the atmospheric reanalysis fields of the National Centers for Environmental Prediction (NCEP/NOAA) Climate Forecast System Reanalysis (CFSR), with a resolution of 1/4°. The forcing fields were 10 m wind, 2 m air temperature and mixing ratio, precipitation, short wave radiation flux, and long wave radiation flux. HYCOM 1/12 numerical simulations were integrated from 1 January 2012 until 31 December 2012.

3.2 Observations

T-S profiles from Argo delayed-mode and XBT were used in the assimilation. In situ data from 2,329 Argo profilers were found in the southwest Atlantic in 2012 (Figure 1b). On the other hand, only 701 temperature profiles of the XBTs were found in the region. The XBT T-S pair was composed with SS obtained through the hybrid method described in section 2.4.

Sea surface temperature (SST) from the Operational Sea Surface Temperature and Sea Ice Analysis (OSTIA) of the UK MetOffice with 1/20° resolution (http://ghrsst-pp.metoffice.com/data/OSTIA/) (Donlon et al., 2012) was used in the data assimilation experiments. SST data were only assimilated at grid points surrounded by water and with depth greater than 30 m. The delayed L4 version of the data and their associated SST errors were used during assimilation.

Sea level anomaly (SLA) maps with 1/4° resolution from Archivage, Validation et Interprétation des données des Satellites Océanographiques (AVISO) (https://www.aviso.altimetry.fr/en/data/products/sea-surface-height-products.html) (Ducet et al., 2000) were also assimilated in the present work. The SLA data are calculated using the Mean Dynamic Topography (MDT), which is composed of data from 1993 to 2012. This mean field contains data from altimeters available in different periods, including Envisat, Jason-1, Jason-2, and Cryosat. The delayed-time product was chosen because of the improved processing and QC applied to the gridded field.

3.3 The Modified EnOI Scheme

RODAS employs a modified EnOI scheme (Evensen, 2003). Following Tanajura et al. (2014, 2020), the analysis equations of the EnOI implemented by REMO can be written as
$$\mathbf{w}^{a} = \mathbf{w}^{b} + \mathbf{K}\left(\mathbf{w}^{o} - \mathbf{H}\mathbf{w}^{b}\right) \tag{3}$$
$$\mathbf{K} = \left(\mathbf{C} \circ \mathbf{B}\right)\mathbf{H}^{T}\left[\mathbf{H}\left(\mathbf{C} \circ \mathbf{B}\right)\mathbf{H}^{T} + \mathbf{R}\right]^{-1} \tag{4}$$
where w is the state vector and the superscripts a, b, and o denote analysis, background, and observation, respectively. H is the observation operator, and K is the gain matrix. C is a localization operator, R is the observation error covariance matrix, and B is the background error covariance matrix. The notation C ∘ B denotes the Schur product between C and B.
EnOI is a simplified form of the Ensemble Kalman Filter (EnKF) and uses an ensemble of model states from a previous free run to estimate the background error covariance matrix. In this work the ensemble used 126 members belonging to an interval of 6 years (2008–2013). Twenty-one members were selected for each year with an interval of 3 days composing a 60-day window centered in the analysis day. This running mean assumes that background errors are equivalent to shorter time-scale variabilities. The ensemble anomalies were used to compute the B matrix. In this paper, B was estimated by
$$\mathbf{B} = \frac{\alpha}{n-1}\,\mathbf{A}\mathbf{A}^{T} \tag{5}$$
where n is the number of ensemble members and A is the ensemble of model anomalies defined by
$$\mathbf{A} = \left[\mathbf{w}'_{1}, \mathbf{w}'_{2}, \ldots, \mathbf{w}'_{n}\right] \tag{6}$$
In Equation 5, α ∈ (0, 1] is a scalar used to inhibit or potentiate the magnitudes of model covariances for a particular application, and $\mathbf{w}'_{i}$ is the anomaly of ensemble member i with respect to the ensemble mean. Here, α was set as 0.3.
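The ensemble construction and the analysis update can be illustrated with a small sketch. It assumes the standard EnOI forms w^a = w^b + K(w^o − H w^b), K = (C∘B)Hᵀ[H(C∘B)Hᵀ + R]⁻¹, and B = αAAᵀ/(n − 1); the function names, the dense matrices, and the toy dimensions are hypothetical, and systems of realistic size typically avoid forming B explicitly.

```python
from datetime import date, timedelta

import numpy as np


def ensemble_dates(analysis_day, years=range(2008, 2014), half_window=30, step=3):
    """Dates of the free-run states used as ensemble members.

    For each year, members are taken every `step` days inside a window of
    2 * half_window days centered on the day of year of the analysis.
    """
    day_of_year = analysis_day.timetuple().tm_yday
    offsets = range(-half_window, half_window + 1, step)  # 21 offsets
    dates = []
    for year in years:
        center = date(year, 1, 1) + timedelta(days=day_of_year - 1)
        dates.extend(center + timedelta(days=off) for off in offsets)
    return dates


def background_covariance(members, alpha=0.3):
    """B = alpha * A A^T / (n - 1); `members` has shape (n_state, n_members)."""
    n = members.shape[1]
    anomalies = members - members.mean(axis=1, keepdims=True)
    return alpha * (anomalies @ anomalies.T) / (n - 1)


def enoi_analysis(w_b, w_o, H, B, C, R):
    """Analysis state w_a = w_b + K (w_o - H w_b) with a localized gain."""
    localized = C * B  # Schur (element-wise) product
    innovation = w_o - H @ w_b
    innovation_cov = H @ localized @ H.T + R
    # Solve a linear system instead of inverting the innovation covariance.
    return w_b + localized @ H.T @ np.linalg.solve(innovation_cov, innovation)


# Toy example: 3 state variables, 10 members, every variable observed.
rng = np.random.default_rng(0)
B = background_covariance(rng.normal(size=(3, 10)))
w_b = np.zeros(3)
w_o = np.array([1.0, 2.0, 3.0])
w_a = enoi_analysis(w_b, w_o, np.eye(3), B, np.ones((3, 3)), 0.1 * np.eye(3))
```

With six years and 21 offsets per year, `ensemble_dates` returns 126 dates, matching the ensemble size quoted above. In the toy update, increasing R pulls the analysis toward the background, whereas a vanishing R returns the observations.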

3.3.1 Calculation of the Innovation Vector

The innovation vector of the profile data was calculated following the approach proposed by Thacker and Esenkov (2002), which was later successfully applied by Xie and Zhu (2010), Tanajura et al. (2014), Mignac et al. (2015), and Tanajura et al. (2020). This procedure creates a quantity defined as the synthetic observational layer thickness, or dpobs. The T-S pairs of the Argo profilers and the temperature profiles of the XBTs, accompanied by the previously generated SS, were used to calculate potential temperature and potential density profiles. Then, the potential density profiles were employed to estimate the depths of the interfaces of the model layers and dpobs, considering the model potential density vertical coordinates. The calculation of dpobs is only possible when the T-S pair is available. The dpobs associated with each T-S pair was assimilated in a multivariate way, with the model state vector composed of the model layer thicknesses and the zonal and meridional components of velocity. In a subsequent step, T and S were assimilated in a univariate way in the ML. Below the ML, the salinity from Argo was assimilated, and temperature was diagnosed by the equation of state. When assimilating XBT data, on the other hand, temperature was assimilated, and salinity was diagnosed. In an isopycnic layer, assimilating only T or S, and diagnosing the other, is necessary to keep the model target densities constant. Thus, the simultaneous presence of the T-S pair is indispensable for computing the innovations in a modified EnOI scheme. The innovations of SST and SLA were computed using model instantaneous fields of the variable of interest. The model SLA was calculated by subtracting the model mean sea surface height (SSH) from the model SSH at the assimilation time. The mean SSH was calculated from the daily outputs of 8 years (2006–2013) of the model free run.
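The idea behind dpobs can be illustrated with a simplified sketch. It is not the algorithm of Thacker and Esenkov (2002): it assumes a statically stable potential density profile, places each layer interface at the density midway between adjacent target densities, and expresses thickness in meters rather than in pressure units. The function name and the example values are hypothetical.

```python
import numpy as np


def synthetic_layer_thickness(depth, sigma, target_sigma):
    """Layer thicknesses implied by an observed potential density profile.

    depth        : observation depths (m), increasing
    sigma        : observed potential density anomaly (kg/m3), strictly increasing
    target_sigma : model layer target densities (kg/m3), increasing
    """
    depth = np.asarray(depth, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    target = np.asarray(target_sigma, dtype=float)
    if np.any(np.diff(sigma) <= 0):
        raise ValueError("the density profile must increase with depth")
    # Assumed interface densities: midway between adjacent layer targets.
    interface_sigma = 0.5 * (target[:-1] + target[1:])
    # Depth at which the observed profile reaches each interface density;
    # np.interp clamps to the profile ends, giving zero-thickness layers
    # for densities not present in the profile.
    interface_depth = np.interp(interface_sigma, sigma, depth)
    edges = np.concatenate(([depth[0]], interface_depth, [depth[-1]]))
    return np.diff(edges)


# Idealized linear stratification sampled every 100 m.
depth = np.array([0.0, 100.0, 200.0, 300.0, 400.0])
sigma = np.array([24.0, 25.0, 26.0, 27.0, 28.0])
dp_obs = synthetic_layer_thickness(depth, sigma, [24.5, 25.5, 26.5, 27.5])
```

The sketch makes the dependence on the T-S pair explicit: without salinity, the potential density profile `sigma` cannot be computed, and no layer thickness can be derived from an XBT cast.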

3.3.2 Localization and Observational Errors

The localization operator C in Equation 4 is applied to reduce the influence of the spurious covariances of B. A radius of localization was applied around the assimilated grid point. The covariances between grid points were weighted with a quasi-Gaussian function. The radius of localization was defined as 150 km for the assimilation of the Argo profilers and as 50 km for the assimilation of the XBT transects. This choice is supported by the fact that, unlike Argo, XBT transects are densely sampled, with probes usually launched less than 30 km apart. This approach is intended to preserve the high T-S variability present in western boundary current regions.
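A distance-dependent weighting of this kind can be sketched as follows. The localization equations used in this work are those of Xie and Zhu (2010) and are not reproduced here; the example instead assumes the fifth-order piecewise rational function of Gaspari and Cohn, a widely used quasi-Gaussian taper with compact support, and assumes that the weight vanishes exactly at the localization radius.

```python
import numpy as np


def gaspari_cohn(distance_km, radius_km):
    """Quasi-Gaussian taper: 1 at zero distance, 0 at and beyond radius_km."""
    c = radius_km / 2.0  # half-width; the support of the function is 2c
    r = np.abs(np.atleast_1d(np.asarray(distance_km, dtype=float))) / c
    w = np.zeros_like(r)
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri = r[inner]
    w[inner] = (-0.25 * ri**5 + 0.5 * ri**4 + 0.625 * ri**3
                - (5.0 / 3.0) * ri**2 + 1.0)
    ro = r[outer]
    w[outer] = (ro**5 / 12.0 - 0.5 * ro**4 + 0.625 * ro**3
                + (5.0 / 3.0) * ro**2 - 5.0 * ro + 4.0 - 2.0 / (3.0 * ro))
    return w


# Weights at the same distances for the two radii quoted in the text.
distances = np.array([0.0, 25.0, 50.0, 75.0, 150.0])
w_argo = gaspari_cohn(distances, 150.0)  # Argo: 150 km radius
w_xbt = gaspari_cohn(distances, 50.0)    # XBT: 50 km radius
```

With the shorter radius, an XBT cast influences only its immediate neighborhood, so adjacent casts along a densely sampled transect do not smooth each other's signal.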

Vertical localization was also applied in dpobs considering that the distance between the layers is measured by the stratification of the water column instead of the Euclidean distance. The chosen scale factor was 0.5 kg/m³, based on Mignac et al. (2015). A horizontal radius of influence of the observations was also applied with the same magnitude as the radius of localization. The localization equations were taken from Xie and Zhu (2010).

The standard deviations of the observational errors of T and of S from Argo are the same as those used in Xie and Zhu (2010). The T errors from XBTs increase downward (Kizu & Hanawa, 2002) and were computed with an increment of 0.1°C for each 100 m from the surface to the bottom:
urn:x-wiley:21699275:media:jgrc24141:jgrc24141-math-0008(7)
The errors for the SS were estimated through Equation 2 for each depth level and for each subregion. Thus, the RMSD of SS and the T and S errors composed the R diagonal matrix.

The SST and SLA errors were provided together with the data by the institutions that are responsible for these products. The 2-D fields of the errors increase with the observed SLA and SST large-scale and mesoscale variability, and they are negatively correlated with the presence of remote observations (Donlon et al., 2012; Ducet et al., 2000).

3.4 Assimilation Experiments

Two 1-year integrations, from 1 January 2012 until 31 December 2012, were carried out assimilating T-S profiles, SLA, and SST. They were initialized from a model free run. In the first experiment, the EnOI assimilated SLA, SST, and in situ data of T and S from 2,329 Argo profilers found in the southwest Atlantic (Figure 1b). This experiment was called RODAS. In the second experiment, in addition to all observations employed in RODAS, 701 XBT temperature profiles were also assimilated with SS obtained through the hybrid method described in section 2.4. This was called RODAS_XBT. The assimilation runs aim to evaluate the feasibility of the method developed for SS generation and the impact that the assimilation of XBT profiles could bring to the HYCOM + RODAS system. In general, a large impact was not expected, since only 701 probes were added in a time window of 1 year. However, localized impacts were expected in the regions where XBT data were available.

In both experiments, data were assimilated every 3 days. For the T-S data, an observational window of 72 hr was employed, so that observations from up to 2 days before the analysis day were also assimilated, but with errors increased as a function of the age of the data, following Tanajura et al. (2020). In order to assess the impact of the assimilation, the results of a free run without assimilation, with the same model configuration used in the assimilation experiments, were also considered. This control integration is referred to as FREE.

3.5 Comparison Between Experiments

Figure 6 presents the RMSD profiles of T and S of the analysis and the background state for the three experiments against all 701 XBTs that were assimilated. Both assimilation experiments reduce the error with respect to the FREE run for T as well as for S. For T, the assimilation of SST, SLA, and Argo (solid blue) proved to be beneficial at the XBT points. The vertically averaged RODAS error for T at the XBT points over the top 700 m was 1.33°C. This is 29% smaller than the error of the FREE run (1.87°C). Since RODAS did not assimilate XBT probes, its analysis and background curves are nearly the same, with the analysis error (1.33°C) almost equal to the background error (1.36°C). The analysis curve of RODAS_XBT has the greatest vertically averaged RMSD reduction, attaining 0.68°C, against 1.33°C for the RODAS_XBT background curve. Also, it is smaller than any other RMSD curve at every depth down to 700 m. The correction achieved by the analysis of the RODAS_XBT run is strong throughout the water column, and its RMSD is about 2°C smaller than that of the FREE run at 500 m. Below this depth, both the RODAS and RODAS_XBT background curves and the RODAS analysis curve tend to reduce their errors toward deeper layers. This convergence toward the RODAS_XBT analysis curve can possibly be explained by three main factors: (i) the FREE run presents a natural reduction of its error between 450 and 600 m, thus impacting the ensemble anomalies and reducing the analysis increment; (ii) the XBT observational error (see Equation 7) increases with depth, giving more weight to the background state and decreasing the importance of assimilation; and (iii) there is a reduction in the number of XBT probes that reach deeper layers. The overlap of the two background curves is due to the spatial variability of the XBT probes, which mostly update points along different transects in each assimilation cycle.
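The error statistics quoted above can be reproduced with a few lines. The sketch below assumes that model and observed values are stored as arrays of shape (profiles, depth levels), with NaN where a probe did not reach a given level; the function names are illustrative.

```python
import numpy as np


def rmsd_profile(model, obs):
    """RMSD at each depth level, ignoring levels not sampled by a profile (NaN)."""
    diff = np.asarray(model, dtype=float) - np.asarray(obs, dtype=float)
    return np.sqrt(np.nanmean(diff**2, axis=0))


def percent_reduction(reference, value):
    """Relative reduction of `value` with respect to `reference`, in percent."""
    return 100.0 * (reference - value) / reference


# Two profiles, three levels; the second probe stops above the last level.
model = np.array([[20.0, 15.0, 10.0], [21.0, 14.0, 9.0]])
obs = np.array([[19.0, 15.0, 8.0], [21.0, 12.0, np.nan]])
profile = rmsd_profile(model, obs)

# Vertically averaged T errors quoted in the text (degrees Celsius).
free_vs_rodas = percent_reduction(1.87, 1.33)
```

Here `free_vs_rodas` evaluates to about 28.9, consistent with the 29% reduction of RODAS relative to FREE reported above. The NaN handling also illustrates factor (iii): at depths reached by few probes, the RMSD rests on a smaller sample.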

Root mean square deviations against 701 XBTs for (top) temperature and (bottom) synthetic salinity. Dashed curves were used for the background and solid curves for the analysis. The FREE run is represented in black.

For salinity (bottom panel of Figure 6), the RMSD curves were calculated against the SS directly generated by the hybrid method described in section 2.4. Although SS is a synthetic quantity, the chart shows the impact of assimilation and whether the analysis increments are in the expected direction, leading the model background toward the observations. The increments are in the right direction, since the RODAS_XBT analysis curve showed smaller RMSD than all the other curves from the surface to the bottom of the investigated water column. For depths greater than 500 m, the same pattern observed for T is also observed for S. The RODAS run showed the same vertically averaged value (0.26 g/kg) for the analysis and background curves, while the RODAS_XBT analysis curve (0.11 g/kg) improved on its background curve (0.26 g/kg). The FREE run exhibited the greatest RMSD value, 0.34 g/kg.

As mentioned above, the assimilation of XBT probes over only 1 year was not expected to be a game changer. The HYCOM + RODAS system already takes into account the Argo profilers to update the thermohaline state of the southwest portion of the Atlantic Ocean. The present experiments were also evaluated against the Argo T-S data on a daily basis, considering the 24, 48, and 72 hr outputs of the model run after each assimilation along the integration period. Assimilation was performed every 3 days. Therefore, the Argo data employed in the evaluation may be regarded as independent, since they would be assimilated only in the next assimilation cycle, taking the 72 hr hindcast of the previous cycle as background. In this evaluation, RMSD was computed using 2,329 Argo profilers. Figure 7 shows the combined error of the 24, 48, and 72 hr hindcasts after each assimilation. The RODAS run is able to correct the model state of the FREE run in the direction of the observations. Although the assimilation of XBTs in RODAS_XBT did not produce any striking change in the shape of the curves, it also did not degrade them. For T, RODAS_XBT showed slightly lower errors than RODAS at all depths. This suggests that the method developed here did not produce any major inconsistency. For both T and S, the RMSD profile of RODAS_XBT is almost always smaller than that of RODAS, suggesting that the assimilation of XBTs helped to correct the thermohaline state of the water column even at the irregular Argo positions.

Root mean square deviations of 24, 48, and 72 hr hindcasts for T (top) and S (bottom) after each assimilation against 2,329 Argo observations present in Metarea V.

Annually averaged fields of T and S down to 800 m, SSH, SST, surface currents, and the temperature, salinity, and thickness of the ML were evaluated, but no major changes were observed between the experiments with assimilation. The XBTs are usually launched along the same transects a few times per year; therefore, most of the analyses were produced without XBT data. Thus, the averaged variables of RODAS and RODAS_XBT hardly differ from each other except in very specific places or in some specific analyses. The 701 XBTs assimilated in RODAS_XBT were distributed over 61 distinct days and participated in only 31 objective analyses. So, to assess the local impact that the assimilation of XBTs could bring to HYCOM + RODAS, a closer investigation along the NOAA AX97 XBT line was performed.

Between 25 and 27 February 2012, 34 XBT profiles were collected along the AX97 line between Cape São Tomé and Trindade Island (Figure 1b). The companion salinity values were estimated using the hybrid approach, and geostrophic velocities between T-S profiles were then computed by the dynamic method considering a level of no motion at 700 m. The model T and S fields were interpolated to the locations of the XBTs, and the geostrophic current was obtained in the same way as for the XBT currents. XBT data (Figure 8, top panel) reveal the presence of the BC core (over 0.45 m/s) at 40°W and two other southward current jets at 34.8°W and 31.7°W with maximum speed near the surface. These results are in good agreement with Evans and Signorini (1985), who showed that a BC trifurcation near the Vitória-Trindade Ridge may occur east of 35°W. These results are also consistent with Pereira et al. (2014), who investigated numerical outputs of HYCOM and the Simple Ocean Data Assimilation (SODA) at 22°S and found maximum annual velocities of 0.54 ± 0.15 and 0.43 ± 0.11 m/s, respectively.
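The dynamic method with a level of no motion can be sketched through the thermal wind relation. The example below is a simplified illustration, not the code used in this work: it takes density profiles at two neighboring stations as input (in practice computed from T and the companion SS with an equation of state), uses a Boussinesq reference density, and all names and numerical values are hypothetical.

```python
import numpy as np

OMEGA = 7.2921e-5  # Earth's rotation rate (rad/s)
G = 9.81           # gravitational acceleration (m/s2)


def geostrophic_velocity(depth, rho_west, rho_east, dx_m, lat_deg,
                         ref_depth=700.0, rho0=1025.0):
    """Meridional geostrophic velocity (m/s, positive northward) between two stations.

    Thermal wind with depth positive downward: dv/d(depth) = g / (f rho0) * d(rho)/dx,
    integrated from the level of no motion, where v = 0 is assumed.
    """
    depth = np.asarray(depth, dtype=float)
    f = 2.0 * OMEGA * np.sin(np.deg2rad(lat_deg))
    drho_dx = (np.asarray(rho_east, dtype=float)
               - np.asarray(rho_west, dtype=float)) / dx_m
    shear = G / (f * rho0) * drho_dx
    # Cumulative trapezoidal integral of the shear from the surface down.
    steps = 0.5 * (shear[1:] + shear[:-1]) * np.diff(depth)
    cumulative = np.concatenate(([0.0], np.cumsum(steps)))
    # Subtract the value at the reference depth so that v(ref_depth) = 0.
    return cumulative - np.interp(ref_depth, depth, cumulative)


# Idealized pair at 22°S, 50 km apart: lighter water on the offshore (east) side.
depth = np.arange(0.0, 701.0, 100.0)
rho_west = np.full(depth.shape, 1026.0)
rho_east = rho_west - 0.5 * (1.0 - depth / 700.0)
v = geostrophic_velocity(depth, rho_west, rho_east, dx_m=50e3, lat_deg=-22.0)
```

In this idealized case the velocity is southward (negative) and surface-intensified, as expected for a western boundary current in the Southern Hemisphere with lighter water offshore, and it vanishes at 700 m by construction.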

Zonal section of meridional velocity component (red—positive shade; blue—negative shade) extracted on 27 February 2012 from AX97 XBT transect, FREE run, RODAS, and background state and analysis state of RODAS_XBT. The black dots represent the XBTs relative position.

The FREE run presented a wider and weaker BC, with a maximum speed of −0.35 m/s, displaced to the east in the upper 400 m in comparison to the pattern derived from the AX97 data. Although the FREE run was able to capture a southward jet at 37.3°W, in good agreement with the velocities derived from the XBTs, in general this run showed much weaker velocities, sometimes in the opposite direction.

Since no Argo data were available in this area during this assimilation step at the end of February 2012, the RODAS and RODAS_XBT background sections look nearly identical. These sections show a weaker but better positioned BC compared with the FREE run. The better representation by RODAS and the RODAS_XBT background may be mostly due to the assimilation of Argo profilers in previous assimilation cycles in the region. This hypothesis is supported by the fact that the BC cores in these runs are almost exactly the same, even though the assimilation of XBTs in previous cycles could have changed the velocity pattern in the RODAS_XBT background section. Even with only a few Argo profilers in this area throughout the year, their assimilation proved useful in setting the right position of the BC.

However, they were not able to simulate the appropriate magnitude of this western boundary current. The remainder of the section shows no other striking feature, due to the absence of assimilation of in situ data.

The analysis section of the RODAS_XBT experiment showed the BC well placed and with increased speed. The values are closer to those estimated by the geostrophic velocities from the XBTs. The assimilation of XBT data increased the magnitude of the BC core velocity to values greater than 0.33 m/s. Using surface drifter data in a transect at 22.75°S, Oliveira et al. (2009) found the BC velocity to be 0.39 ± 0.23 m/s. This velocity is quite similar to the one produced here, considering the standard deviation interval and the fact that a mean velocity is being compared with one coming from a single objective analysis. Moreover, the RODAS_XBT run was the only one able to capture the actual direction of a large part of the jets observed in this complex zone. The northward and southward patterns at approximately 39°W, 36.2°W, and between 35°W and 34°W were well simulated and showed much more agreement with the XBT currents than any other run. When the BC encounters the Vitória-Trindade Ridge, the circulation develops a complex motion, and east of 33°W there is good agreement between the jets of the XBT currents and the RODAS_XBT analysis. Although the jets are a little weaker than the XBT velocities, their main features are essentially there. It is remarkable that this complex feature with high spatial variability was well captured by the assimilation of XBT data. It should be noted that, in comparison to the RODAS_XBT background, only the XBT data collected in the previous 2 days were additionally assimilated in the RODAS_XBT analysis. These facts point to the benefits of assimilating XBT data with companion SS profiles in addition to the Argo, SLA, and SST data. It can improve the representation of the BC and its associated meanders and broaden the range of applications of HYCOM + RODAS in oceanography and environmental studies. It is expected that similar results could be obtained in other regions, such as those under the influence of the Gulf Stream.

4 Conclusions

A new hybrid approach for estimating salinity as a function of temperature was proposed in the present work, with a focus on the Southwest Atlantic, namely, the so-called Metarea V. Climatological data and SS obtained through fifth-degree polynomials were used together to produce the best possible estimate of vertical profiles of salinity to accompany temperature data that lack a salinity pair. This strategy of combining climatological data with SS data was based on a search for the smallest RMSDs along the water column. The results obtained here agree in part with previous studies, which show that a mean T-S curve can capture most of the halosteric variability of the ocean. Compared with earlier works (Goes et al., 2018; Hansen & Thacker, 1999; Marrero-Díaz et al., 2001, 2006), this scheme has the main advantages of being simple, easy to implement, and of very low computational cost. The coefficients in Table 1 cover a great part of the Southwest Atlantic. They are time independent and suitable for use at any depth down to 750 m. They can be applied for a variety of purposes, such as calculations of dynamic heights and geostrophic currents or feeding data assimilation schemes that require the T-S pair.

The coefficients obtained here were used to generate SS profiles to accompany the temperature from XBT probes. These profiles were tested in data assimilation experiments for the year 2012. Due to the low sampling of XBTs in the region of interest, the general impacts were limited when considering fields averaged in space and time. However, local and instantaneous impacts were substantial, as presented by the case study along the AX97 XBT transect. It has been shown that the hybrid approach proposed here is feasible and that the XBT lines may bring gains for forecasting and hindcasting systems when assimilated together with other ocean variables. One of the most important findings of this study is that the assimilation of XBT data with the proposed SS improved the representation of the Brazil Current (BC) by the HYCOM + RODAS system, both in direction and magnitude, when one single objective analysis was investigated. The corrections imposed were physically consistent according to recent works (Lima et al., 2016; Pereira et al., 2014), and it is expected that, if data were continuously available in longer assimilative runs, an improved representation of the BC mean and variability could be produced.

The results presented here may have a positive impact on studies of ocean circulation, in particular on those that use historical data or currents estimated from XBT profiles, and on ocean forecast applications in which salinity plays a critical role. The observation system evaluation experiment developed here points to the importance of maintaining XBT lines for the assessment and improvement of the representation of the BC and, therefore, of the South Atlantic circulation. The XBT transects remain relevant even today, when autonomous profilers account for a large part of the observed T-S profiles.

In the near future, other predictors and other approaches to generate SS may be used with in situ measurements to improve the estimates offered here, including information about seasonal, interannual, or decadal variability of salinity in the upper ocean, such as the seasonality term of Goes et al. (2018). Moreover, many studies have merged satellite data, such as sea surface height or sea surface salinity, with observations at either regional or global scales using different techniques such as multivariate regressions or gravest empirical modes. Products coming from Aquarius, Soil Moisture and Ocean Salinity (SMOS), and Surface Water and Ocean Topography (SWOT) were already used in other works with relative success and shall still be very useful in the future. These include the works of Swart et al. (2010), Meijers et al. (2011), and Yang et al. (2015), to name only a few. Spatial covariance functions using weights given by satellite altimetry can also improve the estimates.

Acknowledgments

This work was partially supported by PETROBRAS and the Brazilian oil regulatory agency ANP (Agência Nacional de Petróleo, Gás Natural e Biocombustíveis), within the special participation research project Oceanographic Modeling and Observation Network (REMO). The authors would like to acknowledge the fellowship Grant 446528/2014-5 provided by the Brazilian National Council for Research and Development (CNPq) and the crucial computational infrastructure provided by the MCTIC/FINEP/CT-Infra 01/2013 Project 0761/13.

Data Availability Statement

All the data used in this work are referenced in text or can be found online (at https://rederemo.org/webdata/GEOFF/).