Fast and Accurate Computation of Vertical Modes

The vertical modes of linearized equations of motion are widely used by the oceanographic community in numerous theoretical and observational contexts. However, the standard approach for solving the generalized eigenvalue problem using second‐order finite difference matrices produces O(1) errors for all but the few lowest modes, and increasing resolution quickly becomes too slow as the computational complexity of eigenvalue algorithms increases as O(n3) . Existing methods are therefore inadequate for computing a full spectrum of internal waves, such as needed for initializing a numerical model with a full internal wave spectrum. Here we show that rewriting the eigenvalue problem in stretched coordinates and projecting onto Chebyshev polynomials results in substantially more accurate modes than finite differencing at a fraction of the computational cost. We also compute the surface quasigeostrophic modes using the same methods. All spectral and finite difference algorithms are made available in a suite of Matlab classes that have been validated against known analytical solutions in constant and exponential stratification.


Introduction
Vertical modes arise as part of the separable solution to both the internal wave problem and quasigeostrophic theory. The eigenvalue problem (EVP) is treated in many introductory physical oceanography textbooks, (e.g., Cushman-Roisin & Beckers, 2011;Gill, 1982), and the resulting vertical modes describe the vertical structure of the linear solutions for a given density profile. While there are an infinite number of bases that can be used to represent ocean currents and density anomalies satisfying certain boundary conditions, the vertical modes correspond to O(1) dynamical solutions of the equations of motion and are therefore both diagnostic and prognostic. For this reason, vertical modes are the standard basis with which to represent the vertical structure of ocean currents, and it would be hard to overstate their usefulness for describing and modeling the ocean.
There are two primary uses for the vertical modes: (1) projecting a given flow field onto the vertical modes to determine its spectrum (a forward transformation) or (2) creating a dynamically consistent linear flow field from a given spectrum (an inverse transformation). For example, projection onto the vertical modes (the forward transformation) was used to construct the Garrett-Munk internal wave spectrum (Munk, 1981;Polzin & Lvov, 2011), analyze regional and global ocean simulations (Buijsman et al., 2010;Kelly et al., 2012;Zilberman et al., 2009), interpret satellite altimetry observations (LaCasce & Wang, 2015;Zhao, 2016), and to describe the vertical structure of balanced flow in the ocean (Wortham & Wunsch, 2014;Wunsch, 1997). 1. A poorly defined mean density function. 2. Measurement noise and uncertainty in the density function and in the case of a forward transform the dynamical variables. 3. Aliasing error, due to the location of the grid points of the dynamical variables (relevant only for forward transform). 4. Interpolation error, due to the location of (and lack of) data points specifying the density function. 5. Numerical truncation error of the modes generated from the vertical EVP.
Whether performing a forward or inverse transformation, these sources of error will compound in some fashion to create error in either the resulting flow field or the resulting spectrum.
The first source of error, a poorly defined mean density function, can result from lack of data or often just uncertainty in what qualifies as a mean (e.g., questions of what time and length scales to average over or even whether averaging is the correct approach for a nonlinear system). Measurement noise and uncertainty in data is a topic unto itself, so the methods here proceed without concern for measurement noise when possible. The aliasing error arises when the grid points are placed such that higher modes project onto the lower modes. This source of error is relatively easily controlled by computing the condition number of the projection matrix, which provides a fairly precise cutoff for the number of resolvable modes. On the rare occasion that a density function can be specified analytically, the interpolation error can be minimized to numerical precision. However, in the usual case where the density is given on some grid with uneven spacing, the density function must be interpolated in between those grid points. Our work suggests that density interpolation error does not dominate the error for most cases. This manuscript is therefore largely concerned with the final source of errors-arising from the numerical representation of the modes in the vertical EVP.
The standard method for solving the vertical EVP is to discretize the problem and construct the matrices using second-order finite difference matrices (e.g., Cushman-Roisin & Beckers, 2011). However, this approach produces unacceptable errors for all but the very lowest modes in the simplest stratifications. This is problematic because numerical algorithms for solving EVPs scale as O(n 3 ), where n is the number of discretization points in the vertical, so small increases in resolution come at a large computational cost. Instead of using finite difference methods, Kelly (2016) solves the hydrostatic form of the EVP spectrally using Galerkin's method, at a fraction of the computational cost and with much higher accuracy. Dunphy (2009) uses a Chebyshev collocation method to solve the nonhydrostatic case with fixed frequency. Here, we extend the ideas of Kelly (2016) and Dunphy (2009) to the more general nonhydrostatic cases where the EVP must be solved for each frequency in the spectrum or each resolved wavenumber in a numerical model. Our approach solves the EVP spectrally with Chebyshev polynomials to produce high-quality vertical modes, even with relatively small n. The same techniques are then applied to solve for surface quasigeostrophic modes, used to describe the effect of density anomalies at the ocean boundaries.
To illustrate the differences in performance of these two methods, we consider the 10th vertical mode at two different frequencies in a realistic stratification profile. The left panel in Figure 1 shows an example stratification profile taken from the Sargasso Sea during the Latmix 2011 summer field experiments (Shcherbina et al., 2015). The 10th vertical mode is computed with 512 grid points using both second-order finite difference methods (blue) and spectral methods (purple). The center panel shows that, at hydrostatic frequencies, finite difference methods perform well. However, the right panel shows the 10th mode at a nonhydrostatic  (Shcherbina et al., 2015). The two vertical lines indicate a nearly hydrostatic frequency (dashed) and nonhydrostatic frequency (dotted). The 10th mode computed using finite difference methods (blue) and spectral methods (purple) is compared at the hydrostatic frequency (center panel) and nonhydrostatic frequency (right panel). Both methods use 512 evenly spaced grid points as input then interpolate to a high-resolution grid as output.
frequency (corresponding to the dashed line in the left panel) for which the finite difference method fails. Generally, we find that finite difference methods with N grid points will return about N∕10 good modes, although this is stratification and grid point location dependent. For many low-mode low-frequency studies then, finite difference methods are fully up to the task. However, for nonhydrostatic frequencies or high modes, the methods described below are more suitable.
This paper begins with a derivation of the two most relevant forms of the vertical EVP that arise from the linearized equations of motion in section 2. This provides the necessary context for orthogonality relations that form the basis of the normalization of the vertical modes, which in turn shows limitations of using certain vertical modes as a basis. Section 3 demonstrates some of the problems associated with finite differencing, while section 4 shows how these can be remedied using spectral methods. Details of the numerical implementation are described in section 5, and some of the other sources of error are examined in section 6. Section 7 discusses some best practices and potential pitfalls. Finally, the appendices include the exact analytical solutions for constant and exponential stratification that are employed for unit testing, the WKB (Wentzel-Kramers-Brillouin) approximated solution used for the adaptive grid method, and the class inheritance tree of the publicly available Matlab implementation of these methods.

Background
The linearized equations of motion for the fluid velocity u (x, , z, t), v(x, , z, t), w(x, , z, t), on the -plane are where p (x, , z, t) and (x, , z, t) are the perturbation pressure and density, respectively. These are defined such that the total pressure p tot (x, , z, t) = p(x, , z, t) + p 0 (z) and the total density tot (x, , z, t) = 0 +̄(z) + (x, , z, t), where z p 0 (z) = −ḡ(z). All variables in the equations of motion are functions of x, , z and t, except for̄which is strictly a function of z. The operator z is understood to reduce to d dz when applied to univariate functions. We use the usual definition of the buoyancy frequency N 2 (z) ≡ − g 0 z̄.
There are three linearly independent solutions to equations (1)-(5), assuming periodic horizontal boundary conditions and a flat bottom: two wave solutions and the geostrophic solution.

Wave Solutions
The positive frequency wave solution is given by where the functions F(z) and G(z) are the eigenfunctions with corresponding eigenvalue h, to be discussed in detail below. The frequency is determined through the dispersion relation, and the negative rotating wave solution is found by flipping the sign on the frequency,  → − . In this notation the phase angle of the wave is given by = tan −1 ( l k ) and K = √ k 2 + l 2 is the total horizontal wavenumber. The horizontal phase is given by ± = kx + l ± t. The eigenvalue h is referred to as the equivalent depth and is related to the wave group velocity, c g = √ gh. The value h can be replaced in favor of eigenfrequency using the dispersion relation, but equation (6) uses both in order to avoid singularities at K = 0 and for notational compactness. Applying this solution to equations (1)-(5) leads to two coupled equations for the vertical structure functions, which can be combined into various second-order EVPs (see section 2.3).

Geostrophic Solution
The geostrophic solution is given by where 0 = kx + l . The solution can also be written entirely in terms of stream function (x, , z) Satisfying the vertical momentum equation requires that N 2 G = −g z F, but unlike the wave solutions, geostrophic solutions already satisfy continuity. For a given wavenumber (k, l) the geostrophic solution is entirely specified by a vertical profile of any one of the variables, from which all others are immediately deduced. For example,̂(k, l, z) determines G(z), from which F(z) is determined by integration-the thermal wind balance. There is no preferred basis for the geostrophic solution.

Journal of Advances in Modeling Earth Systems
10.1029/2019MS001939 solution into three parts: two parts resulting from the density anomaly at the boundaries and one part from the remaining density anomaly in the interior. Following Lapeyre and Klein (2006), the idea is then to let where both the surface and bottom components of the flow are required to have no potential vorticity in the interior, but account for the density anomaly at the boundaries, for example, The resulting modes can be solved for a given wavenumber and are referred to as the surface quasi-geostrophic (SQG) modes. This methodology has been used to construct the interior velocity field from sea surface height (SSH) and temperature data Wang et al. (2013).
Smith and Vanneste (2013) formulate a new EVP that results in modes capable of capturing surface density anomalies for quasigeostrophic flows. Taking equation (9) from Smith and Vanneste (2013) and writing it in the present notation, we have where is a tunable parameter. Importantly, these modes remain orthogonal, unlike the combined set of SQG modes and interior modes described above.
An alternative to both the SQG and the Smith and Vanneste (2013) approaches is to use the internal wave eigenbasis constructed with the free surface boundary condition (detailed below) which also results in orthogonal modes capable of capturing nonzero density anomaly at the ocean surface.

Vertical EVP
The vertical EVP is formed using the two coupled equations from section 2.1 for the vertical structure functions, (N 2 − 2 )G = −g z F (vertical momentum) and F = h z G (continuity). In combination with the dispersion relation, one of the eigenfunctions can be eliminated, resulting in various single second-order EVPs for the vertical structure functions F or G. The two most practical EVPs to solve are for G(z) with constant , and for G(z) with constant K, The near-geostrophic EVP is found by combining the vertical momentum condition N 2 G = −g z F with the continuity condition F = h z G, which can be treated as the ( -constant EVP) with = 0. Note that this is equivalent to making the hydrostatic approximation (Kelly, 2016) and removes all dependence on frequency and wavenumber K.
For the cases considered here we take the lower boundary condition at z = −D to be either free slip, where w(−D) = 0, or no slip, where u(−D) = 0. These correspond to G(−D) = 0 and F(−D) = 0, respectively. These conditions can be seen as the limiting cases of sloped bottom topography (LaCasce, 2017). The surface boundary condition at z = 0 is taken to be either a rigid lid with w(0) = 0, G(0) = 0 or a free surface approximated as p(x, , 0) = 0 g (x, , 0), where ≡ −( z̄) −1 is the linear approximation of the isopycnal displacement. In terms of the vertical modes, the free surface boundary condition becomes h z G(0) = G(0). Finally, there are many cases where the density profile does not extend to the full depth of the ocean and no boundary conditions (beyond the EVP itself) should be added.

10.1029/2019MS001939
Solving either EVP yields a set of eigenvalues h that can be ordered such that h 1 > h 2 > h 3 > .. > h n , each with corresponding eigenfunction G . This means that solving the ( -constant EVP) results in wave solutions with the same frequency , but different wavenumbers K , and similarly solving the (K-constant EVP) results in wave solutions with the same wavenumber K and different frequencies . Although we do not implement this numerically, note that rearranging the (K-constant EVP) poses the EVP for fixed group velocity, gh, with eigenvalue K 2 .
The equations for the ( -constant EVP) and (K-constant EVP) are both Sturm-Liouville problems and share the property that their eigenmodes are orthogonal. Following the procedure in Kelly (2016), the eigenmodes found with the ( -constant EVP) satisfy and while the eigenmodes found with (K-constant EVP) satisfy and where and are unspecified constants that depend on the chosen normalization, as discussed below. It is important to note that these orthogonality conditions only apply for a particular choice of or K. For example, an eigenmode G (z, k 1 ) found using K = k 1 is not orthogonal to an eigenmode G (z, k 2 ) found using K = k 2 if k 1 ≠ k 2 .
The most significant difference between the two EVPs is that eigenmodes from the (K-constant EVP) often form a complete basis set for typical ocean stratification profiles, while the eigenmodes from the ( -constant EVP) do not. The (K-constant EVP) is a regular Sturm-Liouville problem when the weighting function w K (z) ≡ N 2 − 2 0 > 0 for all z, a condition typically met in the ocean. We note that although it is fair to say that stratification with N > 0 is typical of the world oceans, after examining 30,000 CTD profiles, Kunze (2017) finds that 10% of the data have N < 2 and a full 30% of the data suggest N < 2 within 380 m of the bottom. In contrast, the weighting function w (z) ≡ N 2 − 2 in the ( -constant EVP) switches sign at turning points z T , where N(z T ) = . Consequently, the norm of an arbitrary function defined on the domain [−D, 0] and satisfying the boundary conditions is not guaranteed to be positive using the norm implied by equation (13), a necessary condition for completeness. Intuitively, this can be seen in Figure 4, which shows that the high-frequency modes have no variance beyond the turning points and are therefore incapable of representing arbitrary functions on the domain.

Normalization
The amplitude of each vertical mode can be scaled by an arbitrary constant, so one must choose a normalization appropriate for the problem being considered. The four most common scenarios are setting the total energy, a horizontal velocity (U), a vertical velocity (W), and the SSH of a given wave.
To set the total energy of the internal wave solution in equation 6, we use the modes from the (K-constant EVP) and therefore use the norm implied by equation 15, where we have chosen = 1 as the normalization constant in order to keep the vertical modes unitless. Another reasonable choice is to take = N 2 0 D∕g, but using = 1 keeps the norm universal, rather than problem specific. Taking the total energy of the wave,

10.1029/2019MS001939
and then depth integrating and averaging over space and time produces a wave with energy P 2 ∕2 if we set Setting the maximum initial eastward velocity to U can be accomplished by imposing max F = 1 and A = U. The maximum vertical velocity W is set using the norm max G = 1 where A = W∕(Kh ), but is clearly singular for inertial waves which have no vertical velocity. The SSH is set using the pressure at the surface with F(0) = 1 and A = SSH · Kh . To set the total energy of the interior geostrophic solution in equation 9, we assume the solution uses the typical geostrophic modes from equation ( -constant EVP) with = 0 and therefore use the norm implied by equation 13, where we have taken = D∕h i . This produces a mode with energy P 2 if we let, Setting the maximum eastward velocity requires B = U 0 gl using max F = 1. The SSH is set using the pressure at the surface by setting F(0) = 1 and B = SSH.

The Problem With Finite Differencing
Computing the lowest vertical modes with finite differencing methods can be relatively fast and accurate when considering a single wavenumber or frequency. Although one can encounter problems with the higher modes, this can usually be ameliorated by increasing resolution. The primary issues with finite differencing arise when needing to compute many modes across a broad range of frequencies and wavenumbers-the two scenarios that motivated the present study. To compute a complete internal wave frequency spectrum requires solving the ( -constant EVP) at each resolved frequency between the Coriolis frequency and the maximum buoyancy frequency, roughly O(10 2 ) EVPs. This is especially challenging near the buoyancy frequency, where all oscillations occur in the narrow region where N(z) < . On the other hand, initializing a numerical model with an internal wave spectrum involves solving the (K-constant EVP) for each resolved wavenumber in the model, which easily requires O(10 4 ) computations or more.
A prerequisite to initializing a numerical model with a precise internal wave spectrum is that the modes must be computed for each unique horizontal wavenumber K resolved by the model using the (K-constant EVP). If the numerical model has (N x , N ) horizontal grid points, approximately N x N ∕2 unique EVPs must be solved (up to another factor of 2 can be eliminated with isotropic horizontal resolution). Unfortunately, eigenvalue algorithms scale as O(n 3 ) for n by n matrices. This means that initialization of an internal wave spectrum scales as O(N x N N 3 z ), and thus, with any reasonable vertical resolution, this will quickly become a rate limiting step to a model run. In practical terms, the computation time of these EVPs takes approximately 1, 10, and 100 s for N z of 512, 1,024, and 2,048, respectively, on consumer hardware from 2015.
The problem is further exacerbated by the poor performance of finite difference methods. To demonstrate, we compare different numerical methods against an analytical solution. Consider an exponential density profile-the canonical deep ocean stratification profile which has known analytical solutions for both the nonhydrostatic internal modes (Garrett & Munk, 1972), as well as the SQG modes (LaCasce, 2012). Using a numerical method to solve the (K-constant EVP), we can compute the relative error of the numerical solution to the analytical solution. We define the relative error as where (z i ) is the true solution evaluated at the grid points z i and i is the numerical approximation. Figure 2 shows the maximum relative error of the eigenmodes F, G and eigenvalue h found by solving the (K-constant EVP) using second-order finite difference methods for standard exponential stratification with 64 vertical grid points (blue). Details of the numerical implementation of the analytical solutions are given in Appendix A. Reasonable error magnitudes are O(10 −2 ) ; however, the top panel of Figure 2 shows that no modes computed with finite differencing satisfy this condition. The situation is even worse for the K = 2 500m case shown in the bottom panel, where even the lowest mode has an O(0.1) error. The blue dotted line in Figure 2 shows that increasing the resolution tenfold for second-order finite differencing decreases the error by a factor of 100, as would be expected. However, this comes at 1,000 times the computational cost and still barely produces any usable modes.
The accuracy of finite differencing can be increased by going to higher orders (Fornberg, 1998), since the truncation error at order s scales as (Δz) s . The truncation error of the 10th mode in exponential stratification is shown in the top panel of Figure 3 where the blue (solid, dotted) line shows the (second and sixth)-order finite difference method converging at its predicted rate. The bottom panel shows that even with 1,024 grid points, there are only eight usable modes for the second-order finite differencing method, while sixth-order gives up to 50 modes. However, while increasing the order of the method does provide some gains in accuracy, the most efficient way to proceed is simply to use spectral methods, which promise exponentially decreasing truncation error, rather than the polynomial truncation errors offered by finite differencing. When using an analytical density function (dashed lines, Figure 3) rather than gridded data, there is no interpolation error and the spectral methods truncation errors reach a noise floor somewhere between 64 and 128 grid points. Furthermore, the number of usable modes is an order of magnitude higher than even the sixth-order finite difference method. In practical terms, the second-order finite difference method is producing about 10 good modes in 100 s, while the spectral methods are producing about 100 good modes in 1 s. The increase in truncation error at higher resolution is likely due to increasingly compounded errors of the eigenvalue solvers. Figure 3. The top panel shows relative error as a function of resolution for the 10th mode in exponential stratification with K = 2 500m . The density function is specified on an evenly spaced grid (solid lines) or passed directly as an analytical function (dashed lines). The bottom panel shows the number of usable modes as a function of resolution, defined as the number of modes with truncation errors less than 10 −2 . The convergence rates of the second-order and sixth-order finite difference methods are found to be (Δz) 2.0 and (Δz) 5.8 , respectively.

Chebyshev Polynomials
Written in matrix form the (K-constant EVP) is where v is the vector representation of the normal mode G at grid points in z, To use Chebyshev polynomials, we project vector v onto a Chebyshev basis usingv = T −1 v, where T is the matrix that transforms a vector from a Chebyshev basis to the coordinate basis. In a practical sense, the columns of T are the Chebyshev polynomials. Then the EVP becomes or simply The vectorv contains the coefficients needed to reconstruct eigenfunctions, and zz T are the second derivatives of the Chebyshev poynomials.
The optimal choice of grid for Chebyshev polynomials is a Gauss-Lobatto grid (Boyd, 2001;Canuto et al., 2006); for example, equation (31) below and thus the eigenmatrices and eigenfunctions are always created on a Gauss-Lobatto grid for any chosen coordinate. Because the basis functions are continuous functions of z, the resulting vertical modes can be interpolated onto any grid at any resolution by evaluating the functions at the points of interest.
It is, however, rarely the case that density is given as an analytical function or that observations are made on a Gauss-Lobatto grid, which means that typically the density needs to be interpolated on the appropriate grid. Interpolation is performed using B-splines implemented with the numerical framework described in Early and Sykulski (2020). The advantage to using B-splines to represent gridded density data is that it is easy to accommodate arbitrary grids. Despite being a low-order method, this is generally not a limitation (see section 6). In the cases shown in Figure 2, the algorithms are given the density ( ) on a uniform grid in z of 64 points and the resulting modes are returned on the same grid (except where noted for the high-resolution finite differencing case). This is, of course, suboptimal for the spectral cases which use a Gauss-Lobatto grid on various coordinates to compute the EVP. When given an analytical function for density, these methods perform even better, as can be seen in Figure 3.
Despite the potential limitations imposed by interpolating the density with B-splines onto a Gauss-Lobatto grid, the red line in Figure 2 shows that the Chebyshev method performs extremely well, even while interpolating from an evenly spaced grid and outputting to the same grid. The first 20 and 14 modes have error less than O(10 −2 ) for the K = 0 and K = 2 ∕500m −1 cases, respectively. However, at higher horizontal resolution (larger wavenumbers K), even the spectral method's errors grow large, because the points at which the functions are evaluated do not sufficiently capture the oscillations of the modes. This can be remedied by using a stretched coordinate, s.

Stretched Coordinates
In order to find an independent coordinate better suited to capturing the structure of the eigenmodes, we rewrite the EVPs in terms of a generic coordinate s and then consider two concrete examples. Taking z to be a function of s and applying the chain rule leads to and EARLY ET AL. For example, the (K-constant EVP) becomes where now F = h( s z) −1 s G. The free surface boundary condition in these coordinates becomes ( s z) −1 s G = G∕h, while the normalization conditions are now and A necessary condition for using stretched coordinates is that the function s(z) must be strictly monotonic.

Density Coordinates
For density coordinates s = −ḡ∕ 0 , equation (25) can be written as where F = hN 2 s G. Note that the derivatives of the density are computed on the z coordinate then projected onto the s coordinate in the EVP. This avoids using inverses of functions that tend towards zero and therefore has greater numerical stability. While the method does well for the high wavenumber case (Figure 2, lower panel), it performs somewhat poorly with a uniform relative error of O(0.03) for all modes in the low wavenumber (upper panel), as shown by the orange line. Evidently, density coordinates cluster points inefficiently in this case.

WKB Stretched Coordinates
A compromise between depth (z) and density (̄) coordinates is the WKB stretched coordinate, s = ∫ z D N(z ′ ) dz ′ . In this case the EVP becomes where F = hN s G. Figure 5. Relative error as a function of vertical mode number using 64 evenly spaced grid points for the frequency and stratification shown in Figure 4. Shown is the WKB scaled spectral method (purple) as in Figure 2 and also the adaptive grid method (green) that clusters points near the regions of oscillation. Only the first 16 modes are shown.
The purple line in Figure 2 shows that the vertical modes computed on WKB coordinates have uniform accuracy of O(10 −3 ) for K = 0, outperforming the density coordinate case and also performing nearly as well as density coordinates in the high wavenumber case.

Adaptive Grid for -Constant EVP
Solving the ( -constant EVP) suffers from the additional challenge that as the frequency increases and the distance between turning points decreases, the grid spacing necessary to capture the mode structure becomes ever smaller. As noted in section 1, this issue arises when considering internal waves near a pycnocline. An example stratification profile and vertical mode found at a frequency approaching the maximum frequency in the pycnocline is shown in Figure 4. The relative error as a function of mode for this example is shown in Figure 5, from which it is clear that even the WKB stretched coordinate that performed so well for the (K-constant EVP) does relatively poorly in this scenario. Sturm-Liouville theory tells us that the nth F mode will have n zero crossings in the oscillatory region where N 2 (z) > 2 (Arfken, 1970). Thus, in order to resolve these oscillations, one would require at least 2n optimally placed points in that region, as well as additional points to capture the variance in the decay regions. Simply increasing resolution of the Chebyshev grid cannot efficiently solve the problem, as grid points will continue to be poorly placed.
To resolve this issue, we devise an ad hoc method for clustering points in regions of interest. Our approach is to partition the domain into regions where the modes are hypothesized to be nonzero, formulate the EVP for each region (using WKB stretched coordinates), then couple the equations at the region boundaries. This enables us to assign most of the points to the regions where the solution is assumed to be nonzero, and allocate a few remaining points in the other regions. A comparison of this adaptive method and the standard WKB stretched coordinates is shown in Figure 5, where the adaptive method is able to capture a few more usable modes than the standard WKB stretched coordinate method with one EVP. The value of this method becomes more pronounced as the maximum frequency is reached. The number of usable modes (error < 10 −2 ) drops to zero as the maximum buoyancy frequency is approached when using the single EVP, as shown in Figure 6. However, using the adaptive grid algorithm, we are able to guarantee a minimum number of usable modes as points cluster around the turning frequencies.
The equation boundaries are established by using the WKB approximated solution to identify the regions where the modes are expected to be nonzero. Specifically, the equation boundaries are the points where WKB solution (A40) decays to 10 −5 of its value from the turning point. This is an adjustable tolerance, chosen to be small enough that only a few points are needed to capture the nearly zero function but large enough that the nonzero regions aren't unnecessarily large. The gray vertical lines in Figure 6 show the number of coupled equations being used to solve the EVP. At the lowest frequencies only one equation is used and the method is identical to the WKB stretched coordinate method described in section 4.3. As the frequency increases, the algorithm eventually separates into two coupled equations: one for the top boundary and pycnocline and another for the deep region where no mode variance is expected (refer to Figure 4). At high enough frequency the region above the pycnocline is decoupled as well, and three coupled EVP problems are solved.
The EVPs are coupled by requiring that the function and its first derivative are continuous at the equation boundaries, following the procedure described in section 22.3 of Boyd (2001). The "eigenvalue rule-of-thumb" as discussed in Boyd (2001) states that roughly n∕2 modes will be accurate when using n + 1 Chebyshev polynomials away from boundary layers or critical levels. Solving the EVP with turning points near the maximum buoyancy certainly does not satisfy this criterion, but the rule-of-thumb can be modified to use half of the modes with eigenvalues greater than zero. Although we make no attempt at proving the general validity of this modification, the dashed line in Figure 6 indicates that the rule-of-thumb generally does well at predicting how many modes are good quality.

Numerical Implementation
One of the primary products of this paper is the implementation of these methods as classes in Matlab (see Appendix C for more details). Figure 7 shows the flowchart followed by the initialization algorithm for the InternalModes class, described in this section.
The two methods for initializing the classes are both called using im = InternalModes(rho,z,zOut,latitude); where the arguments rho,z are either a gridded density field at locations z or function handle valid in the domain spanned by [min(z) max(z)]. When the function handle is given, the density function is projected onto Chebyshev polynomials. If gridded data are provided, then the density is interpolated using B-splines. The argument zOut specifies the grid on which all output is given, which need not span the full depth.
After initialization, all classes support setting the upper and lower boundary conditions as well as setting the normalization to any of the choices discussed in section 2.4.
The two primary functions for computing internal modes are for the ( -constant EVP).
The implementation of these methods for finite differencing is straightforward-the EVP is solved either on the gridded input data as given or on a grid that matches the output grid if specified as a function. The differentiation matrices are created using the algorithms described in Fornberg (1998). However, the spectral implementations require additional choices.

The EVP being solved is
[

a(s)T ss + b(s)T s + c(s)T
where s is a generic stretched coordinate, a(s), b(s), c(s), and d(s) are referred to here as coefficient functions.

10.1029/2019MS001939
The algorithm can be separated into the three parts. First, we compute the coefficient functions for each EVP, for example, N 2 and z N for equation (29). Second, the EVP is solved on the appropriate coordinate with n evp points. Finally, the resulting modes are normalized and projected onto the output grid with an arbitrary number of points.

Initialization With an Analytical Function
If the InternalModes class is initialized with a function handle for the density, it is projected onto Chebyshev polynomials which are then used to compute the coefficient functions and, if necessary, the stretched coordinate.
To project onto Chebyshev polynomials, we define a grid with n grid points using where i is an integer index ranging from 0 to n grid −1. We evaluate the density function on that grid,̄(z i lobatto ). The density function is then expanded in a Chebyshev polynomial basis such that wherêk indicates the kth coefficient for Chebyshev polynomial defined on z coordinates. The coefficients for the derivative of the function, denoted̂k z , are then computed using a recursion formula where c k = 2 for k = 0 and c k = 1 otherwise. Because a Gauss-Lobatto grid was used for the z-coordinate, the Chebyshev transformation is performed with a rescaled fast Fourier transformation in O(n grid log n grid ) operations. The differentiation requires only O(n grid ) operations, which means that all of the coefficient functions for the EVP can be computed on a relatively fine grid. For example, n grid = O(2 14 ) takes a fraction of a second on consumer hardware from 2015.
The stretched coordinates implemented here are either s = z, s = −ḡ∕ 0 , or s = ∫ z D N(z ′ ) dz ′ , where the latter two cases require density to be strictly monotonic. For those two cases if̄z ≥ 0 anywhere in the domain, then an error is thrown. For the WKB coordinate, s = ∫ z D N(z ′ ) dz ′ , the integral is computed spectrally using equation (33).

Initialization With Gridded Data
In the more typical scenario where a user initializes the InternalModes class with gridded data from observations or a numerical model, B-splines are used to interpolate the data and compute the coefficient functions. The primary advantage to using B-splines in this scenario is that B-splines can be created for arbitrary grids without suffering from Runge's phenomena at lower orders. We fit the data to sixth-order interpolating spline (with five nonzero derivatives) using the methodology and numerical implementation described in Early and Sykulski (2020).
If the method requires that density remain monotonic (e.g., for WKB and density coordinates), then the B-spline fits are constrained to be monotonic following Pya and Wood (2015). If the data are not monotonic, then this implicitly smooths to find the nearest monotonic fit in a least squares sense.
Computing the stretched coordinate and the coefficient functions from the spline interpolant requires derivatives and integrals of the B-splines, which are relatively straightforward to compute because they are just piecewise polynomials (De Boor, 1978). The WKB method requires computing the square root of the B-spline interpolant. The approach taken here is to build a new interpolating spline of the same order that interpolates between the square root of the data points.

EVP
All three Chebyshev methods solve their respective EVPs on a Gauss-Lobatto grid of their respective coordinate, that is, s = z, s = −ḡ∕ 0 , or s = ∫ z D N(z ′ ) dz ′ . The Gauss-Lobatto grid in s is defined in the usual way as

10.1029/2019MS001939
where the number of points n evp reflects the size of the EVP and is therefore also an absolute upper bound to the number of modes that may be computed.
Once the Gauss-Lobatto grid s i lobatto is created, the corresponding value z(s i lobatto ) is computed using the bisection method as implemented in Driscoll et al. (2014). The method is set to terminate with a relative error of O(10 −12 ).
The coefficient functions in equation (30) are now simply evaluated onto the s i lobatto grid using the interpolant (either a B-spline or Chebyshev). The Chebyshev polynomials and their derivatives (T, T s , and T ss ) are computed using the standard recursion formulas and then multiplied by the coefficient functions to create eigenvalue matrices. The boundary conditions are implemented by replacing the first and last rows of the matrices A and B in (20).
The EVP is then solved using the standard generalized EVP solver. This is typically the rate limiting step in the process, taking O(n 3 evp ) operations. The resulting eigenvectors now contain coefficients to the Chebyshev polynomials defined on the stretched coordinate.

Adaptive Grid
The adaptive grid is created by locating regions in the domain where the solution is expected to be small and then allocating fewer points (and therefore fewer Chebyshev polynomials) to those regions. After identifying the turning points z T where N 2 (z T ) = 2 , the WKB solution (A40) is used to identify the equation boundaries z bnd , the points where F WKB (z bnd , )∕F WKB (z T , ) = 10 −5 . The WKB solution is assumed to work locally in the stratification and is therefore applied at the turning point, z T , in the direction of decaying variance. The eigenvalue (A42) is assumed to be set globally by integration over all oscillatory regions. If m equation boundaries z bnd are identified, they delineate m + 1 regions: the "decay" regions, where the solution is anticipated to be small and governed by the decaying exponential and "oscillatory" regions where the solution is expected to dominate and include the oscillatory solution. These decaying and oscillating regions are necessarily alternating.
The m+1 regions are coupled using the same technique described in section 22.3 of Boyd (2001) by requiring continuity across boundaries for G and z G. The key benefit to this algorithm is that the decay regions are allocated as few as 6 points each, while oscillatory regions are apportioned the remaining points relative to their WKB length, L m = ∫ N(z ′ )dz ′ . While we find that 6 points appear to be sufficient for the decay regions, in practice, we do in fact apportion 1/16th of the total points evenly between the decay regions as a hedge that this ad hoc method will fail for some unforeseen cases.
The adaptive grid algorithms use low-order interpolation to identify z T and z bnd because high accuracy is not required, and this keeps the number of computations O(n grid ).

Normalization
The final step is to normalize the resulting eigenvectors and project them onto z out . Normalization using the two integral conditions can be performed exactly invoking the fact that the integral of each Chebyshev polynomial T k (z) is exactly known We have defined w k such that it can be summed with a vector of Chebyshev coefficients to produce the integral. In other words, ifv k is a Chebyshev coefficient vector, then I = ∑v k w k is the definite integral. The integrands in equations (26) and (27) are computed pseudospectrally before integrating (by transforming to the spatial domain, multiplying, then transforming back Chebyshev coefficients).
Computing the max U and and max W norm is more problematic because the function extrema do not necessarily lie on the grid points. For the implementation, we simply take the maximum at the resolved grid points, but if higher accuracy is required, one could locate the extrema using the methods in Boyd (2014).
Finally, the normalized eigenmodes are projected onto the output grid using the slow Chebyshev transforms, If a large number of output points are requested, this operation could dominate the total computation time.
The algorithm flowchart for the mode computation is shown in Figure 8

SQG Modes
The two functions for computing the SQG modes are psi = im.SurfaceModesAtWavenumber(k); and psi = im.BottomModesAtWavenumber(k); where k is an array of wavenumbers.
The SQG modes are found from equation (B2) using a linear solver after replacing the top and bottom points of the matrix with the boundary conditions. As noted in Tulloch and Smith (2009a), the SQG modes require a high density of grid points near the boundaries, a task well suited to the Gauss-Lobatto grid in equation (31). The number of Chebyshev polynomials is chosen so that the Gauss-Lobatto grid captures at least 10 points over the e-folding scale. The e-fold scale for constant stratification (see Appendix B.1) is Δz efold = 0 KN 0 , and the distance between the first two points in a Lobatto grid, equation (31), is where we have defined D = z max − z min as the depth of the domain. Setting Δz boundary = 1 10 Δz efold , we find that we need points (and therefore also n grid polynomials) to sufficiently capture the SQG mode. The resulting SQG modes are projected onto the output grid using equation (36).

Unit Testing
In order to ensure that each of the algorithm implementations is correct, the numerically generated eigenmodes and eigenvalues are compared against the analytical solutions for constant stratification and exponential stratification shown in Appendix A. The comparison is performed across a range of wavenumbers and frequencies for both surface boundary conditions and all four norms.
The computed SQG modes are also compared against analytical solutions for constant and exponential stratifications, shown in Appendix B.

Other Sources of Error
In section 1 we noted five sources of errors that contribute to the total error when computing the forward or inverse transformation, but this manuscript has primarily focused on one source of error: the numerics of accurately representing the vertical modes in the EVP. We now discuss these other sources of error and how they are dealt with in the numerical implementation.

Aliasing Error
When performing a forward transformation, where a given dynamical field is projected onto the vertical modes, the data grid will determine how many modes are resolvable. As with a Fourier transformation, higher-frequency modes alias into the lower-frequency modes. Unlike the Fourier transformation, however, the optimal grid for performing a transformation is not an evenly spaced grid but depends on the stratification profile, and therefore, the EVP being solved. Here we show that there is a relatively easy method for determining the number of resolvable modes for a given grid using the condition number of the resulting matrix.
To show the effect of different grid choices on the forward transformation, we use the analytical solution of the vertical modes in exponential stratification in combination with an imposed spectrum to generate stochastic isopycnal displacement profiles typical of the world oceans. In particular, we use the Garrett-Munk spectrum, where * is the roll-off mode, usually set to 3 but possibly as high as 20, p is the slope which is usually very nearly 5∕2, and H 0 normalizes sum over = 1..∞ to unity. For each stochastically generated set of coefficients, m , a profile (z) = ∑ N =1 G (z)m is created. The profile is then evaluated on three different grids: z i even , z i lobatto , and z i quadrature , where z i quadrature is the grid of Gaussian quadrature points, determined by the roots of a mode one higher than is trying to be used (Boyd, 2001;Press et al., 1992). Using the first n modes, we then attempt to recover the coefficients using least squares-in practice this is Matlab's mldivide (∖) operator. For example, the first n coefficients of the evenly spaced grid are determined bỹ where = 1..n. The root-mean-square error (rmse) is defined as the error in the sum of recovered and missing coefficients (41) Figure 9 shows the result of trying to recover the mode coefficients,m using successively more modes for the three different grids. In all three cases the rmse decreases as modes are added until a dramatic increase occurs, correlating with a similarly dramatic increase in the condition number of the matrix. The quadrature grid, defined as the roots of the G N−1 mode plus the boundaries, performs best, as expected. Including the boundaries in this definition means that the top and bottom boundary points provide no useful information, and therefore, only N − 2 modes are recoverable for G and N − 1 for F with a rigid lid. This definition is chosen so that the forward and inverse transform for constant stratification coincides with the discrete sine and cosine transform (and their associated grids).
While the condition number of the matrix is clearly controlled by the grid being used, the choice of norm also affects the condition number. Generally speaking, the K-constant norm performs well for transformations with the G modes and the -constant norm with the F modes. The exception to this is when the free surface boundary condition is used, the barotropic mode has a substantially different L 2 norm and should be rescaled.

Mode Error
Here, we test whether or not the truncation error in the vertical modes is a limitation in recovering the coefficients of the spectrum, m . To this end, we created profiles on a quadrature grid with variance distributed using H( ) for a range of parameters including white noise (large ⋆ ), and very smooth (large p), as described above, and recorded how many mode coefficients were recoverable for some error tolerance across all the different numerical methods in this paper. We found that discarding modes with truncation errors exceeding the requested error tolerance of the modes guarantee coefficients recovered with the requested error tolerance. In other words, if one wants relative errors in coefficients of less than 10 −2 , one needs modes with errors less than 10 −2 . The only exception is for very steep spectra (e.g., p = −10), where the coefficients of the higher modes are indistinguishable from zero. The reverse of this is not true-in fact, including modes with truncation errors exceeding the error tolerance can often return coefficients within the bounds of the error tolerance. Evidently, the errors in these ordered, orthogonal bases work systematically in our favor.

Interpolation Error
Another source of error may arise from interpolation of the density function. This issue is treated separately from a poorly defined or noisy mean density function and therefore assumes that the data given are gridded and without error. For gridded data, our method uses a relatively low order B-spline to interpolate between grid points which is then evaluated for the coefficient functions where needed. Does this low-order interpolation method limit the mode recoverability described in the previous section?
To address this question, we compute the vertical modes for exponential stratification from an evenly spaced density function with variable number of grid points. These modes are then used to recover the coefficients m , as above, on a high-resolution quadrature grid. We find that with as few as 16 grid points for the density function, we are able to recover 90 mode coefficients with less than 1% error.

Poorly Defined or Noisy Mean Density
It is often the case that the mean density function,̄(z), is not easily defined. Averaging over a mooring time series of density will not necessarily result in a monotonic density function-and averaging often removes the sharp gradients that exist in individual profiles, which may not be desirable. Even output from numerical models can suffer from these same issues, depending on the boundary conditions. Noisy data, where errors in the observed value of̄(z) stem from instrument errors, also effectively constitute a poorly defined mean. One way to frame these issues is to ask how a misspecified mean density affects our ability to infer the vertical spectrum of a given flow.
First, we note that all methods, as implemented, will proceed without error for noisy data. However, the most notable difference between methods is that the WKB and density coordinate methods use a density function constrained to the nearest monotonic spline fit, as previously described. It is not clear that this implicit smoothing is necessarily "better" than using the unaltered density function with the z-coordinate method for noisy data. This decision, and how to deal with measurement noise in general, is beyond the scope of this manuscript. Additional smoothing of the density data can be done using many techniques, including the constrained smoothing splines described in Early and Sykulski (2020).
In order to test the effects of misspecifying a mean density profile, we generate density profiles in exponential stratification that follow the GM spectrum, as above, but we attempt to recover the coefficients using vertical modes computed from a noisy mean density profile. The results are consistent with section 6.2. For example, as long as the modes from the noisy profile have errors less than 10 −2 relative to the modes generated from the true profile, the recovered coefficientsm will have errors less than 10 −2 relative to m . The relative error of the vertical modes increases as a function of mode number until eventually the mode errors and therefore coefficient errors reach O(1). Exactly where mode errors reach O(1) depends on the details of the noise, or misspecification, of the mean density profile.
While accurately recovering the coefficients m of the spectrum becomes impossible with a noisy mean density profile, we are able to accurately infer the spectrum from which m was generated by either ensemble averaging over additional synthetic profiles or bin averaging nearby modes. One way to see why this might be true is to note that the coefficient errors never exceed O(1)-so although the exact coefficient is incorrect, the magnitude is correct on average. In practice, using modes from the misspecified mean density profile causes variance that should be associated with mode = 33, for example, to be assigned to the variance of nearby modes.
That the spectrum is recoverable despite a noisy mean is consistent with previous analysis methods. In Polzin and Lvov (2011), internal wave spectra are found using WKB approximated modes and a WKB stretched grid. It is also important to note that computing the spectrum with a misspecified mean density function still requires an orthogonal set of modes, and therefore, fast and accurate mode computation is still helpful.

Discussion
The spectral methods presented here solve the most relevant forms of the vertical eigenvalue and surface quasigeostrophic mode problems efficiently and accurately. The methods also include an algorithm for computing modes in challenging stratification profiles at high frequencies near turning points. The algorithms are implemented in a publicly available Matlab suite (https://github.com/JeffreyEarly/GLOceanKit/) using the class hierarchy described in Appendix C, and the implementations are validated against known analytical solutions, under a wide range of conditions. However, the methods do not always perform well under all conditions.
Poorly resolved features and discontinuities in the density profile will produce the Gibbs phenomenon, where "ringing" occurs in the vicinity of the discontinuity. In one example, we found that a narrow 5-m-wide pycnocline in a 5,000-m-deep ocean produced strong spurious oscillations unless the pycnocline was sufficiently well resolved with enough grid points. In another example, we defined an analytical profile with a discontinuity in̄z z (the highest derivative used in the EVP) and this also produced the Gibbs phenomenon. Interestingly, in these cases lower-order finite differencing produced better modes for the same vertical resolution, because these methods implicitly smooth the derivatives. A logical extension of this work would be to apply the "splitting" algorithms in chebfun (Driscoll et al., 2014) to handle such discontinuities.
Our recommendations for projecting observed or modeled fields onto the modes are as follows: • When possible, use a quadrature grid for the fields and modes, for example, the function GaussQuadra-turePointsForModesAtWavenumber in the InternalModesSpectral class computes the Gauss quadrature points for G. This is a relatively expensive operation (it involves solving the EVP) but provides near-optimal point placement. • In the usual case where there is no freedom to choose grid points, compute the condition number as a function of mode as described in section 6.1 and limit the number of modes to a low condition number. Alternatively, if computation time is not a limitation, one can perform the least squares fit of the fields to successively more modes until the coefficients are no longer stable.
Many of the errors described in section 6 may still be a concern but may be quantified with some of the techniques in section 6, by using the density function and spectrum specific to the problem.

Appendix A: Internal Mode Solutions
We present analytical solutions for the internal mode eigenvalue problem in three different scenarios: constant stratification, exponential stratification, and the WKB approximated solution for arbitrary stratification. These solutions are used to validate the numerical implementations.

A.1. Constant Stratification
The internal baroclinic modes in constant stratification are given as with eigendepth h and vertical wavenumber m given by where we have assumed that w = 0 at the lower boundary z = −D. In the case of a rigid lid (w = 0 at z = 0), the correction to the vertical wavenumber is = 0. However, if the linearly approximated free surface boundary condition is used, h G (0) = G (0), then is nonzero. The equations for are transcendental and are therefore solved with a numerical root finding algorithm. The equations for are written in a form conducive for finding the desired root.
• For fixed wavenumber, k, the vertical wavenumber correction is found by solving near = 0 and the eigendepth h is given by • For fixed frequency, , the vertical wavenumber correction is found by solving D(N 2 0 − 2 ) − g( + ) tan( ) = 0 (A6) near = 0 and the eigendepth h is given by The normalization for these modes is given in the first column of Table A1.
In the case of the free surface boundary condition, there also exists a barotropic mode ( = 0), the solution of which changes from trigonometric to hyperbolic at = N 0 , or k = k * where In these cases then, the vertical wavenumber reduces to m 0 = ∕D.
• Trigonometric case, k < k * or < N 0 The solution is exactly the same as the baroclinic solutions given above in equations (A1) and (A2), but now equation (A4) is solved when = 0. As a practical matter, it is numerically more stable to solve where h 0 = D.
• Hyperbolic case, k > k * or > N 0 The solution is given by −For fixed wavenumber, k, the vertical wavenumber correction is found by solving ( for the root near =

A.2. Exponential Stratification
Exponential stratification serves as the O(1) representation of the ocean stratification away from the poles and continental boundaries. As formulated and first solved in Garrett and Munk (1972), we take the stratification to be N 2 = N 2 0 e 2z∕b , where N 0 is buoyancy frequency and b is the thermocline depth (with canonical values of three cycles per hour and 1,300 m).
Letting the stretched coordinate s = N 0 e z∕b as in section 4.1, the ( -constant EVP) becomes is chosen to satisfy the lower boundary condition G(−D) = 0. The Bessel function Y (s) has a singularity at s = 0, so for many choices of and c , ≪ 1 and the second term needs to be neglected for stable numerical evaluation. The order of the Bessel function is set by the frequency = b c , or, using the dispersion relation wavenumber = √ b 2 2 0 c 2 +b 2 k 2 . The F modes are found by taking the derivative The discretization into modes is a result of applying the second boundary condition, in this case either a rigid lid G(0) = 0 or a free surface G(0) = F(0) and then finding the eigenmode speeds c i = √ gh i that satisfy those conditions. The c i s are therefore determined by the roots of Bessel functions, for which there is no general closed form solution. The challenge then becomes finding bounds for the roots and writing the equation in a form suitable for a root finding algorithm.

10.1029/2019MS001939
In practice, it is easiest to find the inverse of the c i s, so we let x = bN 0 c and then write the root equation, (x) = 0, in terms of parameter function (x) and s(x). Again, stable numerical evaluation requires that we work around the singularity of Y , so if (x) < s(x)e −D∕b , then we take and when (x) > s(x)e −D∕b we take where in either case the rigid-lid condition requires and the free surface requires • -constant solution The parameter functions are defined as where the scaling factor is chosen from the WKB approximated solution (Desaubies, 1973), − > N 0 e −D∕b The root equation must be taken to be big (x) and (from Desaubies, 1973, equation 2.18); the first n modes are found within [0.5, n + 1].

−otherwise
The root equation must be taken to be small (x) and where = 0 N 0 and = bk. Desaubies (1973) provides low-frequency (lf) and high-frequency (hf) bounds for the eigenvalues, which we can be written in terms of wavenumber