Geostatistical Analysis of Mesoscale Spatial Variability and Error in SeaWiFS and MODIS/Aqua Global Ocean Color Data
Abstract
Mesoscale (10–300 km, weeks to months) physical variability strongly modulates the structure and dynamics of planktonic marine ecosystems via both turbulent advection and environmental impacts upon biological rates. Using structure function analysis (geostatistics), we quantify the mesoscale biological signals within global 13 year SeaWiFS (1998–2010) and 8 year MODIS/Aqua (2003–2010) chlorophyll a ocean color data (Level-3, 9 km resolution). We present geographical distributions, seasonality, and interannual variability of key geostatistical parameters: unresolved variability or noise, resolved variability, and spatial range. Resolved variability is nearly identical for both instruments, indicating that geostatistical techniques isolate a robust measure of biophysical mesoscale variability largely independent of measurement platform. In contrast, unresolved variability in MODIS/Aqua is substantially lower than in SeaWiFS, especially in oligotrophic waters where previous analysis identified a problem for the SeaWiFS instrument likely due to sensor noise characteristics. Both records exhibit a statistically significant relationship between resolved mesoscale variability and the low-pass filtered chlorophyll field horizontal gradient magnitude, consistent with physical stirring acting on large-scale gradient as an important factor supporting observed mesoscale variability. Comparable horizontal length scales for variability are found from tracer-based scaling arguments and geostatistical decorrelation. Regional variations between these length scales may reflect scale dependence of biological mechanisms that also create variability directly at the mesoscale, for example, enhanced net phytoplankton growth in coastal and frontal upwelling and convective mixing regions. Global estimates of mesoscale biophysical variability provide an improved basis for evaluating higher resolution, coupled ecosystem-ocean general circulation models, and data assimilation.
Key Points
- Geostatistical techniques isolate mesoscale variability independent of satellite platform
- Unresolved variability scales in proportion to surface chlorophyll except in low-chlorophyll regions
- Physical stirring of large-scale gradient supports observed mesoscale variability in surface chlorophyll
1 Introduction
Before remote sensing satellites provided regular, near-synoptic views of ocean basin scale surface distributions of phytoplankton, conferences (e.g., Steele, 1978) were being held to discuss and debate the character of, biophysical mechanisms responsible for, and means of measuring plankton patchiness. This interest was driven by a desire both to predict the size of these plankton patches and to understand their underlying dynamics. This knowledge is sought to improve our ability to use satellite data to guide models (data assimilation) to assess the impact of marine ecosystems (e.g., fisheries) upon the ocean's global role in planetary climate modulation. Initially, much of the characterization and measurement was carried out through 1-D spectral analysis of chlorophyll a distributions from field measurements along ship-tracks (Platt, 1972), theory (Steele, 1974), or theory and measurements (Denman & Platt, 1976). With time, the mesoscale variability patterns displayed by plankton in the surface layer of the ocean (of which patches are just one) came to be associated with the interaction of turbulent mixing and horizontal gradients of tracer distributions (Abraham, 1998; Bennett & Denman, 1985; Franks, 2005; Garrett, 1989, 2006; Lévy, 2003; Lévy & Martin, 2013; Mackas et al., 1985; Powell & Okubo, 1994, to name just a few). A few, more adventurous, researchers attempted to answer some of these questions with structure function analysis (Denman & Freeland, 1985; Yoder et al., 1987) once satellite images became digitally available. It has been over 40 years since the early 1970s, and characterizing the nature and causes of patchiness continues to be a challenge (van Gennip et al., 2016).
Into this milieu we offer our structure function analysis of the remotely sensed, mesoscale resolution distributions of surface chlorophyll a derived from global, multiyear data sets from NASA's Sea-viewing Wide Field-of-view Sensor (SeaWiFS) and Moderate resolution Imaging Spectroradiometer onboard the EOS-PM (Aqua) satellite (MODIS/A).
The idea of using remotely sensed data to illuminate features of scalar variables near the sea surface has been around for a long time. In the past, spatial scale analysis has been accomplished largely with spectral methods requiring various data smoothing preprocessing to avoid problems introduced by data losses due to masking clouds and optically thick aerosols, or sensor performance issues (Denman & Abbott, 1988, 1994; Denman & Platt, 1976; Gower et al., 1980). Some of these drawbacks can be ameliorated with the use of structure function analysis (Doney et al., 2003; Denman & Freeland, 1985; Yoder et al., 1987). Denman and Freeland (1985) present one of the earliest applications of spatial structure functions (variograms) to oceanographic measurements of chlorophyll a taken shipboard during a 3 year period off the west coast of Vancouver Island, Canada. They estimated the correlation functions and unresolved noise levels for the purpose of objective mapping of four oceanographic variables, including log-transformed phytoplankton chlorophyll a values; because the measurements were made shipboard, they were not obscured by clouds. Yoder et al. (1987) followed with an application of structure function analysis to Coastal Zone Color Scanner (CZCS) data from the southeast U.S. continental shelf. Their results demonstrate that satellite data (with cloud induced data dropouts) could be used as input to structure function analysis in the mesoscale range of oceanographic variability. Doney et al. (2003) present monthly, globally distributed structure function results of spatial variability for 1 year (1998) of daily, Level-3 (L-3) standard mapped image (SMI) SeaWiFS data. In this paper, we expand upon their work and present an analysis of both the geographic patterns of mesoscale spatial variability and their corresponding time series for the 13 year (1998–2010) and 8 year (2003–2010) periods of L-3 SMI (nominal 9 km resolution) data from SeaWiFS and MODIS/A sensors.
In this study, we use structure function analysis, also known as variogram analysis (Clark, 1979; Journel & Huijbregts, 1978), to explore the spatial distribution of mesoscale variance in ocean color imagery. By analyzing the distribution of variance spatially we can separate significant resolved mesoscale biological signals from smaller-scale unresolved geophysical variability and systematic noise. Additionally, structure function determination is a necessary prerequisite for objectively mapping ocean color imagery (Bretherton et al., 1976). Previously, spectral methods have been the mathematics of choice to explain patchiness in variability of phytoplankton chlorophyll a (Bennett & Denman, 1985; Denman et al., 1977; Franks, 2005; Garrett, 1989, 2006; Lévy & Martin, 2013; Mackas et al., 1985; Martin, 2003; Powell & Okubo, 1994) in theoretical studies and (Denman & Abbott, 1988, 1994; Denman & Platt, 1976; Gower et al., 1980; Platt, 1972; Shi et al., 2015; Smith et al., 1988; van Gennip et al., 2016) in remotely sensed and shipboard data studies. Many of these previous studies either averaged several images together or carefully chose only clear images to avoid gaps in the data too large to ignore. Our choice of structure function analysis (Yaglom, 1957) was guided by our desire to probe multiple length scales of mesoscale satellite ocean color data comparable to spectral analysis but without the operational problems presented by missing data.
In this paper we present an expanded analysis of the geographic, mesoscale patterns of variability begun in Doney et al. (2003). Estimates of resolved and unresolved variance (uncertainty) as well as decorrelation scale lengths are found for the global, daily 13 year and 8 year time series of SeaWiFS and MODIS/A. We also present a mathematical framework to link the decorrelation scale results from the structure function analysis to mixing scales that act upon horizontal tracer gradients (Garrett, 1989, 2006), producing patchiness. This paper starts with a mathematical derivation of tracer mixing scales related to the root mean square chlorophyll anomaly fluctuations and the large-scale chlorophyll mean gradient, followed by a brief discussion of structure function usage and ocean color data sources and processing. In section 3, we present geographic and temporal ocean color variability at both local and global scales, separating resolved, mesoscale phytoplankton variance from apparent smaller-scale geophysical or systematic noise or their combination. Similarities and differences between results derived from SeaWiFS and MODIS/A are also provided. The paper concludes with the global, time varying patterns of decorrelation scales and their relationship with an eddy scale tracer mixing length, demonstrating a possible geographic distribution of passive and nonpassive tracer behavior of phytoplankton in the upper ocean. We then summarize our results and make a few remarks about the apparent lack of any long-term temporal trend in our results.
2 Data and Methods
2.1 Mathematical Framework
The primary variables analyzed in this investigation are the near-surface chlorophyll a (Chl a) concentration estimates from the ocean color algorithms OC4 and OC3M applied to data from the SeaWiFS and MODIS/Aqua satellite instruments, respectively (O'Reilly et al., 2000). These chlorophyll a estimates are log transformed (Campbell, 1995) to render them into approximately normally distributed data (see Appendix A on log-normal variables). For more details about the satellite data handling and processing see section 2.3.










2.2 Structure Functions












2.3 Ocean Color Data
Data from both SeaWiFS and MODIS/Aqua (MODIS/A) satellite instruments provide near-global coverage every 2 days in 8 or 36 visible and near-infrared spectral bands, respectively (NASA, 2014a, 2014b). Spectral bands common to both SeaWiFS and MODIS/A are combined into ratios of water leaving radiances to make an estimate of the upper-water column chlorophyll a concentration following the OC4 or OC3M algorithms, respectively (O'Reilly et al., 2000). These data are then processed into various levels of resolution and provided via the NASA Ocean Color web site (http://oceancolor.gsfc.nasa.gov/cms/) by the Ocean Biology Processing Group (OBPG) at NASA's Goddard Space Flight Center (GSFC) as either standard mapped image (SMI) or binned (BIN) data files. Level-3 SMI data are created from L-3 BIN data stored on a nearly equal-area, sinusoidal grid. Level-3 BIN data are, in turn, created from lower processing level (Level-2) data aggregated onto this grid over specified time periods (daily, 8 day weeks, monthly, etc.). The L-3 SMI data are stored on an equirectangular grid introducing some distortion in the east–west directions toward the pole. The distortion follows a simple relationship between the pixels and their geographic location and is compensated for during our analysis.
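The east–west distortion described above can be illustrated with a short sketch. It assumes a spherical Earth and a global equirectangular grid of 4320 × 2160 pixels (a 1/12° step); the function name, the constants, and the grid dimensions are illustrative assumptions, not details taken from the processing used in this study.

```python
import math

EARTH_RADIUS_KM = 6371.0      # assumed mean radius of a spherical Earth
PIXEL_DEG = 180.0 / 2160      # assumed grid step (1/12 degree per pixel)


def pixel_spacing_km(lat_deg, pixel_deg=PIXEL_DEG):
    """Return (north_south_km, east_west_km) spacing between adjacent pixels."""
    step_rad = math.radians(pixel_deg)
    north_south = EARTH_RADIUS_KM * step_rad
    # Meridians converge toward the poles, so the east-west spacing
    # shrinks with the cosine of latitude.
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


for lat in (0.0, 45.0, 60.0):
    ns, ew = pixel_spacing_km(lat)
    print(f"lat {lat:4.1f}: N-S {ns:.2f} km, E-W {ew:.2f} km")
```

Under these assumptions the north–south spacing stays near 9 km everywhere, while the east–west spacing falls to about half that value at 60° latitude, which is why lag distances must be corrected for latitude and direction.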
There are arguably significant differences between the two sensors SeaWiFS and MODIS/A. SeaWiFS data quantization is 10 bit versus 12 bit quantization for MODIS/A data. Further, the SeaWiFS signal-to-noise ratio (SNR) is on the order of three times lower than that of MODIS/A. This is of particular importance in the near-infrared bands that are used for atmospheric correction because greater noise in these bands can lead to greater noise in Chl a concentrations, particularly in waters with low Chl a concentrations such as oligotrophic gyres (Hu et al., 2012), where the water leaving signal at the longest wavelength employed in the Chl a algorithm (555 nm for SeaWiFS and 547 nm for MODIS/A) is minimal. Additionally, SeaWiFS L-3 SMI data products are produced from the Global Area Coverage (GAC) observational sampling strategy employed by the SeaStar satellite platform, i.e., nominal 1 km observations are subsampled onboard the satellite as one sample saved every fourth pixel every fourth scan line. Consequently each SeaWiFS 9 km pixel has 1/16 as many samples in its 9 km bin as does MODIS/A (McClain et al., 2004). Fewer samples translate to larger variance in the mean. Finally, the subsampling scheme has another effect on SeaWiFS data noise, namely that complete information about the light field is not available when SeaWiFS data are postprocessed on the ground. Hence, stray light identification and correction of SeaWiFS data is incomplete (stray light being defined as light in the optical system that was not intended in the design) (Hu et al., 2000; Uz & Yoder, 2004). For these reasons, SeaWiFS data should be expected to have more noise than MODIS/A, and our work helps to quantify its magnitude.
The preparation of ocean color data for analysis in this paper closely follows the procedures in Doney et al. (2003). Briefly, daily L-3 SMI SeaWiFS and MODIS/A data are obtained from the OBPG OceanColor web site. We use reprocessing R2010.0 data for both SeaWiFS and MODIS/A satellite instruments. These roughly 9 km resolution images of chlorophyll a (Chl a) are first log transformed to render them approximately normally distributed (Campbell, 1995) following the “law of proportionate effect” (Aitchison & Brown, 1966). The daily log-transformed images are subsequently low-pass filtered in time and space with a 31 day centered average (Hamming windowed) and a spatially moving, two-dimensional Gaussian sparse-data filter with a half-height width set to one half the filter width (i.e., 100 km and 200 km, respectively). Each daily low-pass filtered field is then subtracted from the unfiltered daily image to produce a daily anomaly field (all missing data are set to a default value). For comparison, Doney et al. (2003) used calendar monthly, block averaged means as the subtrahend in the formation of the daily anomalies.
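The anomaly construction can be sketched in one dimension as follows. This is a simplified illustration, not the processing code used here: it omits the 31 day temporal averaging, works along a single transect, interprets the half-height width as the full width of the Gaussian at half its peak, and does not truncate the kernel at the filter width. The function names and the use of `None` for missing data are assumptions made for the example.

```python
import math


def gaussian_lowpass(values, spacing_km, half_height_width_km):
    """Gaussian-weighted running mean that skips missing (None) samples."""
    # Convert the full width at half height to a Gaussian standard deviation.
    sigma = half_height_width_km / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    smoothed = []
    for i in range(len(values)):
        weight_sum = 0.0
        value_sum = 0.0
        for j, v in enumerate(values):
            if v is None:
                continue  # sparse-data filter: only valid samples contribute
            w = math.exp(-0.5 * ((i - j) * spacing_km / sigma) ** 2)
            weight_sum += w
            value_sum += w * v
        smoothed.append(value_sum / weight_sum if weight_sum > 0.0 else None)
    return smoothed


def log_anomalies(chl, spacing_km=9.0, half_height_width_km=100.0):
    """Log10-transform a transect, then subtract its low-pass filtered version."""
    log_chl = [None if c is None else math.log10(c) for c in chl]
    background = gaussian_lowpass(log_chl, spacing_km, half_height_width_km)
    return [
        None if (v is None or b is None) else v - b
        for v, b in zip(log_chl, background)
    ]
```

Because the weights are renormalized over valid samples only, gaps from clouds reduce the number of contributing pixels without biasing the background toward a fill value.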
2.4 Analysis
The procedures for creating and analyzing the structure functions (variograms) derived from ocean color data have both similarities to and differences from those applied in Doney et al. (2003). Here, as before, the global, daily anomaly fields are divided into nonoverlapping 5° × 5° subsets and the empirical variograms are computed from those subsets daily. For each 5° cell, the nominal lag distance of 9 km is set by the resolution of the underlying L-3 SMI data, corrected for the latitude and direction analyzed. In Doney et al. (2003), the structure functions were computed from all possible data pairs in the spatial domain for the North-South and East-West directions only. However, this approach was found to be computationally inefficient. In this paper, fully two-dimensional variograms are computed for each 5° daily anomaly field with a Fast Fourier Transform (FFT)-based variogram algorithm (Marcotte, 1996).
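For orientation, the direct pairwise estimate of an empirical semivariogram along one direction can be sketched as below. This is the classical estimator (half the mean squared difference of all valid pairs at each lag), shown on a single transect with lags in pixel units; it is not the FFT-based algorithm of Marcotte (1996), which produces equivalent quantities far more efficiently in two dimensions. The function name and the `None` convention for missing pixels are assumptions for the example.

```python
def empirical_semivariogram(transect, max_lag):
    """Return {lag: (semivariance, n_pairs)} for a 1-D transect with gaps.

    Missing samples are None; a pair is used only if both ends are valid.
    """
    result = {}
    for lag in range(1, max_lag + 1):
        squared_diff_sum = 0.0
        n_pairs = 0
        for i in range(len(transect) - lag):
            a, b = transect[i], transect[i + lag]
            if a is None or b is None:
                continue  # cloud or other data dropout: skip this pair
            squared_diff_sum += (a - b) ** 2
            n_pairs += 1
        if n_pairs:
            result[lag] = (squared_diff_sum / (2.0 * n_pairs), n_pairs)
    return result


gamma = empirical_semivariogram([0.0, 1.0, None, 1.0, 0.0], max_lag=2)
print(gamma)
```

Keeping the pair count alongside each semivariance matters later, because lags with few valid pairs should carry less weight when daily results are combined.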
The daily empirical variograms are further processed by compositing them on a calendar monthly basis to form monthly averaged, data pair weighted, empirical variograms. From these monthly two-dimensional, empirical variograms the North-South and East-West directional semivariances and number of data pairs are extracted as a function of the lag distance between data pixels. A one-dimensional, spherical model structure function, equation 4, is fit to each extracted, monthly empirical variogram with a Levenberg-Marquardt nonlinear regression routine to yield estimates of the model function parameters and their uncertainties.
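The compositing and model steps can be sketched as follows. The spherical model is written in its standard textbook form, which is assumed here to correspond to equation 4; `c0` is the unresolved variance, `d` is the range, and `c1` is a hypothetical name for the resolved variance (the increase from `c0` to the plateau). The Levenberg-Marquardt fit itself is omitted.

```python
def spherical_model(h, c0, c1, d):
    """Standard spherical variogram: nugget c0, resolved variance c1, range d."""
    if h <= 0.0:
        return 0.0            # the semivariance at zero lag is zero by definition
    if h >= d:
        return c0 + c1        # plateau beyond the range
    r = h / d
    return c0 + c1 * (1.5 * r - 0.5 * r ** 3)


def composite_monthly(daily_variograms):
    """Pair-weighted average of daily {lag: (semivariance, n_pairs)} dicts."""
    sums = {}
    for day in daily_variograms:
        for lag, (gamma, n_pairs) in day.items():
            weighted, total = sums.get(lag, (0.0, 0))
            sums[lag] = (weighted + gamma * n_pairs, total + n_pairs)
    return {
        lag: (weighted / total, total)
        for lag, (weighted, total) in sums.items()
        if total > 0
    }
```

Weighting by the number of data pairs means that a nearly cloud-free day contributes more to the monthly variogram than a day with only a handful of valid pixels.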
2.5 Chlorophyll Gradient



The resultant values are converted to per-kilometer gradients, allowing for the curvature of the Earth. The monthly root mean square (RMS) values of the per km results are then averaged onto the same 5° grid on which the geostatistical properties were calculated, in either the North-South or East-West direction, and displayed in our tables and figures. The gradient algorithm uses a straightforward application of the discretized form of partial derivatives.
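A discretized gradient of this kind can be sketched with central differences on a latitude-longitude grid. The sketch assumes a spherical Earth, rows ordered by increasing latitude, and `None` for missing pixels; the function name and argument layout are illustrative assumptions rather than the routine used in this study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def gradient_per_km(field, lats_deg, dlat_deg, dlon_deg):
    """Central-difference N-S and E-W gradients (per km) of field[i][j].

    Row i lies at latitude lats_deg[i]; columns step eastward by dlon_deg.
    Edge points and points next to missing data are returned as None.
    """
    nrow, ncol = len(field), len(field[0])
    dy = EARTH_RADIUS_KM * math.radians(dlat_deg)
    north_south = [[None] * ncol for _ in range(nrow)]
    east_west = [[None] * ncol for _ in range(nrow)]
    for i in range(nrow):
        # East-west pixel spacing shrinks with the cosine of latitude.
        dx = EARTH_RADIUS_KM * math.cos(math.radians(lats_deg[i])) * math.radians(dlon_deg)
        for j in range(ncol):
            if 0 < i < nrow - 1:
                up, down = field[i + 1][j], field[i - 1][j]
                if up is not None and down is not None:
                    north_south[i][j] = (up - down) / (2.0 * dy)
            if 0 < j < ncol - 1:
                right, left = field[i][j + 1], field[i][j - 1]
                if right is not None and left is not None:
                    east_west[i][j] = (right - left) / (2.0 * dx)
    return north_south, east_west


def rms(values):
    """Root mean square of the valid (non-None) entries, or None if empty."""
    valid = [v for v in values if v is not None]
    return math.sqrt(sum(v * v for v in valid) / len(valid)) if valid else None
```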
3 Results
There are three geostatistical parameters of primary interest in this study. First, unresolved variance (c0) represents variability present in ocean color data that cannot be resolved at that data resolution. It can be variability due to submesoscale geophysical variability, instrument or algorithmic noise, or any combination of the three. Second, resolved variance represents an estimate of the true mesoscale variability in terms of the variance of the log-10 transformed data anomalies. An advantage of displaying variability in terms of unresolved and resolved variance for the log-10 transformed data is that they effectively normalize over the large seasonal and global range in surface chlorophyll. For some applications, however, we also want measures of the absolute variability of surface chlorophyll concentration. As described in Appendix A, both resolved and unresolved (c0) variance can be transformed, using equation A5, into estimates of the corresponding arithmetic variance (and standard deviation) of the original, untransformed data. When plotted as the log of arithmetic standard deviation against the log of mean Chl a, a variable with a constant coefficient of variation would plot as a straight line. We use this to bracket the data with parallel contours of constant relative variability as shown in Figures 2 and 4. Third, the decorrelation scale length (d) represents a distance beyond which any two data points will no longer exhibit any regional correlation. The magnitude of this parameter is dependent upon the size of the low-pass filter used to generate the anomalies initially. This parameter tells us over how great a distance the chlorophyll a concentration retains some level of statistical correlation and, in anisotropic data fields, is also direction dependent.
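The straight-line behavior of constant relative variability can be checked with a small sketch. Appendix A is not reproduced in this section, so the code below uses the standard log-normal identity for the coefficient of variation, which is assumed to be comparable to equations A5 and A6; the function names are illustrative.

```python
import math

LN10 = math.log(10.0)


def cv_from_log10_std(s_log10):
    """Coefficient of variation of a log-normal variable.

    s_log10 is the standard deviation of log10(x); it is first converted
    to natural-log units, then the identity CV = sqrt(exp(sigma^2) - 1)
    is applied.
    """
    sigma_ln = s_log10 * LN10
    return math.sqrt(math.exp(sigma_ln ** 2) - 1.0)


def arithmetic_std(mean_chl, s_log10):
    """Arithmetic standard deviation implied by an arithmetic mean and log10 spread."""
    return mean_chl * cv_from_log10_std(s_log10)


# For a fixed log10 spread the CV is fixed, so log10(std) = log10(mean) + log10(CV):
# a line of unit slope on log-log axes.
for mean_chl in (0.05, 0.5, 5.0):
    print(mean_chl, arithmetic_std(mean_chl, 0.1))
```

Different constant CV values therefore appear as parallel lines offset by the logarithm of the CV, which is what allows them to bracket the data on these axes.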
3.1 Interannual Variability
Figure 1 shows the time series of log-transformed chlorophyll a, tracer-based mixing length, horizontal gradient magnitude, resolved standard deviation, and unresolved standard deviation for the 5° square with its northwest corner at 45°N and 40°W along the North Atlantic Aerosol and Marine Ecosystem Study (NAAMES) cruise track (http://naames.larc.nasa.gov/). This figure displays aspects of the ocean color data record that were not explored in Doney et al. (2003), namely interannual variability, coherent seasonal patterns, and horizontal gradient magnitude. At this site (NAAMES), and in virtually every 5° grid cell, a seasonal cycle in all of the above properties can be clearly seen. In Figure 1a, a maximum surface chlorophyll signal appears in the northern hemisphere late spring to early summer and a lower, secondary maximum occurs during the late fall to early northern winter. The secondary maximum is more pronounced at some locations than others depending on whether or not the site has nutrient-limited spring bloom characteristics of a mid-latitude site (Behrenfeld et al., 2005; Longhurst, 1995; Siegel et al., 2002).

Time series from along the NAAMES program cruise track at 45°N and 40°W showing (a) log-transformed chlorophyll a, (b) the tracer-based mixing length defined in equation (2), (c) the horizontal gradient magnitude of Figure 1a, (d) the geostatistical resolved variability of SeaWiFS and MODIS/Aqua chlorophyll a, and (e) the geostatistical unresolved chlorophyll a variability of the same satellite instruments. Geostatistical parameters (in Figures 1d and 1e) are in standard deviation units of the log10-transformed data.
Although there is agreement between the SeaWiFS and MODIS/A sensor estimates of Chl a from year to year, there are noticeable differences and similarities in the other properties displayed in Figure 1. The time series of the tracer-based mixing length has a seasonal cycle similar to, and apparently in phase with, the log-transformed chlorophyll a. The seasonal cycle of the horizontal gradient has a different magnitude depending on whether one considers the north-south or the east-west direction for both SeaWiFS and MODIS/A (north-south generally being larger than east-west for both instruments). Further, the gradient seasonal cycle is out of phase with the chlorophyll a signal, with the gradient lagging the Chl a peak by a month or two. Estimates of resolved standard deviation from SeaWiFS and MODIS/A agree well at this location, with seasonality peaking in agreement with surface chlorophyll. This demonstrates the strength of the geostatistical technique in isolating the mesoscale geophysical signal. The unresolved variability (Figure 1e) estimates from MODIS/A are lower than SeaWiFS estimates, consistent with the aforementioned greater expected noise in SeaWiFS data (section 2.3), assuming the geophysical noise is the same for both instruments. Table 1 shows the time series average summary of the MODIS/A key variables for select time series sites from the northern and southern hemisphere including Ocean Observatories Initiative (OOI) sites (http://oceanobservatories.org/).
| Parameter | BATS | HOT | NAAMES | Station PAPA | Irminger Sea | Argentine Basin | Southern Ocean |
|---|---|---|---|---|---|---|---|
| Location | (31.7°N, 64.2°W) | (22.8°N, 158°W) | (45°N, 40°W) | (50°N, 145°W) | (60°N, 39°W) | (42°S, 42°W) | (55°S, 90°W) |
| log10(Chl)ᵃ | −1.07 | −1.23 | −0.66 | −0.55 | −0.46 | −0.39 | −0.99 |
| n | 96 | 96 | 96 | 96 | 88 | 96 | 88 |
| Seasonal amplitude | 0.55 | 0.18 | 0.52 | 0.09 | 0.62 | 0.37 | 0.21 |
| Phasing (month) | 3 | 1 | 5 | 9 | 7 | 2 | 1 |
| N-S unresolved variabilityᵇ | 0.061 | 0.063 | 0.051 | 0.056 | 0.059 | 0.062 | 0.049 |
| n | 94 | 96 | 93 | 92 | 69 | 95 | 71 |
| Seasonal amplitude | 0.03 | 0.03 | 0.06 | 0.05 | 0.03 | 0.04 | 0.04 |
| Phasing (month) | 7 | 5 | 4 | 1 | 6 | 11 | 9 |
| E-W unresolved variabilityᵇ | 0.067 | 0.075 | 0.049 | 0.051 | 0.055 | 0.058 | 0.032 |
| n | 95 | 94 | 94 | 94 | 71 | 92 | 72 |
| Seasonal amplitude | 0.04 | 0.03 | 0.05 | 0.05 | 0.03 | 0.05 | 0.05 |
| Phasing (month) | 7 | 4 | 5 | 1 | 2 | 12 | 9 |
| N-S resolved variabilityᵇ | 0.16 | 0.16 | 0.13 | 0.12 | 0.13 | 0.18 | 0.09 |
| n | 91 | 94 | 91 | 89 | 69 | 94 | 67 |
| Seasonal amplitude | 0.12 | 0.06 | 0.16 | 0.10 | 0.12 | 0.22 | 0.10 |
| Phasing (month) | 4 | 7 | 5 | 10 | 8 | 12 | 1 |
| E-W resolved variabilityᵇ | 0.15 | 0.16 | 0.13 | 0.12 | 0.14 | 0.16 | 0.074 |
| n | 93 | 93 | 94 | 92 | 69 | 92 | 69 |
| Seasonal amplitude | 0.12 | 0.06 | 0.14 | 0.08 | 0.08 | 0.25 | 0.08 |
| Phasing (month) | 4 | 5 | 5 | 10 | 7 | 12 | 1 |
| N-S rangeᶜ | 82.4 | 63.3 | 74.0 | 66.8 | 60.8 | 73.5 | 55.0 |
| n | 93 | 93 | 95 | 91 | 71 | 94 | 70 |
| Seasonal amplitude | 33.4 | 21.8 | 27.6 | 39.9 | 14.5 | 42.1 | 46.8 |
| Phasing (month) | 9 | 10 | 1 | 2 | 3 | 3 | 9 |
| E-W rangeᶜ | 81.8 | 82.4 | 65.6 | 58.6 | 51.4 | 72.4 | 52.4 |
| n | 93 | 94 | 93 | 94 | 69 | 94 | 72 |
| Seasonal amplitude | 98.2 | 57.1 | 27.8 | 25.0 | 23.1 | 50.6 | 30.5 |
| Phasing (month) | 9 | 2 | 8 | 1 | 2 | 3 | 11 |
| Mixing length, L ᶜ | 113.0 | 206.0 | 110.5 | 94.6 | 99.5 | 136.3 | 91.1 |
| n | 94 | 95 | 91 | 87 | 71 | 94 | 70 |
| Seasonal amplitude | 69.3 | 81.6 | 59.0 | 65.5 | 36.1 | 109.1 | 51.3 |
| Phasing (month) | 8 | 6 | 5 | 10 | 9 | 12 | 9 |
| Gradient magnitudeᵈ | 0.074 | 0.038 | 0.074 | 0.077 | 0.082 | 0.089 | 0.060 |
| n | 96 | 96 | 96 | 95 | 72 | 96 | 73 |
| Seasonal amplitude | 0.082 | 0.014 | 0.064 | 0.063 | 0.054 | 0.110 | 0.066 |
| Phasing (month) | 4 | 9 | 6 | 1 | 8 | 2 | 1 |
- a In units of log10(mg Chl/m3).
- b Unresolved and resolved variability are in units of coefficient of variation (equation A6).
- c N-S and E-W ranges and the mixing length L are in units of kilometers.
- d
Gradient magnitude is in units of
km.
3.2 Local Temporal Relationships
The relationships between geostatistical variability, expressed as arithmetic standard deviation, and mean chlorophyll a for the NAAMES site become apparent when the time series in Figure 1 are collapsed into scatter plots (Figure 2). The monthly climatologies (solid dots) are also shown to emphasize the seasonal behavior of the relationships versus the additional scatter introduced by interannual variability. Lines of constant coefficient of variation (CV) are plotted by combining equation A6 with equation A5 (making allowance for conversion between base-10 and natural logarithms). The monthly resolved variability data points (open circles) and seasonal monthly climatologies for both instruments lie in a single group. At lower Chl a concentrations they group around a line of constant CV, but jump to a line of higher constant CV at higher Chl a concentrations. The unresolved variability ranges with a CV from about 3% to 10% (Figure 2a) and the resolved variability with a CV from about 10% to 30% (Figure 2b). There is a divergence in behavior between SeaWiFS and MODIS/A in Figure 2a, which we will return to in our discussion of global relationships (section 3.3).

Log-log plots of geostatistical variability as arithmetic standard deviation (mg Chl/m3) plotted against the logarithm of chlorophyll a for the 5° cell time series along the NAAMES program cruise track (45°N and 40°W). On these axes, lines of constant coefficient of variation plot as straight lines (black dashed lines) and bracket the data in terms of relative variability. Open symbols for MODIS/A (red, N = 92) and SeaWiFS (blue, N = 145) are monthly values and solid symbols (N = 12) are monthly climatologies for the same satellites. (a) Unresolved variability as the arithmetic standard deviation, defined as the square root of equation (A5) (mg Chl/m3), versus the logarithm of chlorophyll a. (b) Same as (a) for resolved variability.
3.3 Global Patterns
In an attempt to deconvolve the underlying causes of the seasonal and interannual variability observed in Figures 1 and 2, global spatial patterns of geostatistical parameters are examined. Figure 3 displays geographic patterns, in 5° cells, of variance weighted, long-term means of unresolved variability and resolved variability of the monthly time series (SeaWiFS 1998–2010 and MODIS/A 2003–2010). The patterns shown here are very similar to the patterns found in Doney et al. (2003) even though the N-S and E-W results have been averaged together in Figure 3 for each instrument. Immediately apparent are the much lower unresolved variability values for MODIS/A (Figure 3b) versus SeaWiFS (Figure 3a). This difference is especially noticeable in the oligotrophic gyres where the general background chlorophyll a concentrations are low. We interpret this difference as due, in part, to a greater number of bits recorded by the MODIS/A detector than SeaWiFS (12 versus 10 bit quantization) allowing MODIS/A to distinguish more effectively between noise and signal at low Chl a concentrations (see section 2.3). Regardless, the pattern of elevated unresolved variability in the center of the basin scale gyres (areas of low Chl a) is clearly visible for both instruments in both the unresolved (Figures 3a and 3b) and, to a lesser degree, as a bleed-over into the resolved variability (Figures 3c and 3d). This pattern implies an inverse relationship between the underlying chlorophyll concentrations and the relative level of uncertainty with respect to the unresolved variability at low-chlorophyll levels in SeaWiFS.

Global distribution maps of resolved and unresolved variability for both SeaWiFS and MODIS/Aqua. The values displayed here are the average of the north-south and east-west results (see text for further discussion). (a) The unresolved variability derived from SeaWiFS data. (b) The unresolved variability from MODIS/A data. (c) The resolved variability derived from SeaWiFS data. (d) The resolved variability from MODIS/A data. Note that the unresolved variability from MODIS/A data (b) is much lower than SeaWiFS yet the resolved variability is still slightly elevated in the oligotrophic gyres in the same pattern as SeaWiFS (c).
Perhaps nowhere else are the differences between SeaWiFS and MODIS/A and the attributes of our analysis better shown than in the comparisons displayed in Figure 4. Similar to Figure 2, dashed lines of constant CV are drawn on the plot to show relative variation; however, in Figure 4 all of the 5° cell, long-term means are plotted as points, with those poleward of 60° latitude as open circles. We assume there is a true geophysical variability signal (variance) that is scale dependent, spanning from submesoscale through mesoscale. Choosing a sampling resolution of 9 km (SMI grid) partitions that true variability signal into two components, resolved (>9 km or mesoscale) and unresolved (<9 km or submesoscale). Figure 4a shows that an increase in the measurement value adds additional noise (more variance) to the unresolved component with a magnitude dependent on characteristics of the satellite sensor data product noise. As in Figure 2a, the scatter of data points for unresolved variability tends to range with a CV from ∼3% to ∼10% for both instruments. However, at low Chl a concentrations (below ∼0.18 mg Chl/m3) the behavior of the two instruments diverges. While MODIS/A relative variability of unresolved error stays at or below a CV of ∼10%, SeaWiFS unresolved variability rises to a CV of ∼30%. The relationship between unresolved variability, in terms of arithmetic standard deviation, and the logarithm of chlorophyll a (Figure 4a) shows that SeaWiFS data have a larger unresolved variance at low chlorophyll. Assuming that the true geophysical submesoscale signal is the same for SeaWiFS and MODIS/A, this implies a larger contribution of instrument-specific noise for the SeaWiFS data (section 2.3).

Log-log plots of geostatistical variability as arithmetic standard deviation plotted against the logarithm of chlorophyll a for the annual mean of the global collection of 5° cells. As in Figure 2, lines of constant coefficient of variation plot as straight lines (black dashed lines) and bracket the data in terms of relative variability. All points poleward of 60° are plotted as open circles for SeaWiFS (blue) and MODIS/A (red). (a) Unresolved variability as the arithmetic standard deviation, defined as the square root of equation (A5) (mg Chl/m3), versus the logarithm of chlorophyll a. (b) Same as (a) for resolved variability.
Resolved variability in terms of arithmetic standard deviation, plotted versus the logarithm of chlorophyll a, is shown in Figure 4b and demonstrates a similarity between SeaWiFS and MODIS/A. Most resolved variability plots between the 10% and 30% coefficient of variation contours and only a small amount of divergence between the two instruments at low Chl a is apparent. Figure 4b implies that the geostatistical technique is isolating a robust measure of biophysical, mesoscale variability that is largely independent of measurement platform. Nevertheless, SeaWiFS does exhibit larger resolved variability than MODIS/A in the same low-chlorophyll regions as unresolved variability (compare Figure 4a with Figure 4b). In an ideal situation, both instruments should capture the same magnitude of resolved variability reflecting only the true geophysical mesoscale signal. This would hold if our geostatistical method partitioned all sources of instrument-specific noise into the unresolved component, but clearly this is not the case. The larger SeaWiFS variances in both the resolved and unresolved components suggest that our geostatistical method is not fully partitioning the larger noise levels in SeaWiFS between the unresolved and resolved components, resulting in an apparent leakage of noise from the unresolved into the resolved component. This could reflect deficiencies in low-pass filtering, the variogram approach, or that some of the noise in SeaWiFS is spatially correlated (memory effects of optical sensors, cloud ringing, stray light artifacts, etc.) as previously mentioned in section 2.3 and Doney et al. (2003), or any combination of the three. Our results apply to the variance components of SeaWiFS and MODIS/A and do not directly address possible biases in the low-frequency time or space means. Any consideration of the reasonableness of either instrument's Chl a measurements, in any region, is a question best decided by the investigator using the data.
Our analysis yields insight into the underlying mechanism of patchiness creation. Building from the definition of
in equation 2, Figure 5a displays the square root of the resolved variability
, an estimate for
, for both instruments as a function of the local horizontal gradient magnitude. Figure 5a has lines of constant, operationally defined empirical mixing scale,
(equation 2) plotted as dashed lines. Inspection reveals that the data poleward of 60° (open circles) plot between 10 and 50 km, while the solid points (
) plot between 50 and 300 km, reflecting a connection of surface chlorophyll mesoscale length scales to the Rossby radius of deformation noted earlier by Doney et al. (2003). Additionally, linear regressions of
(in units of
) as a function of the local horizontal gradient magnitude (also in units of
) were performed, and the r² and slope are reported. Both regressions are statistically significant at the 95% confidence level (p-value <0.0001). Since the x axis is in units of gradient per 100 km, 100 times the slope yields an estimate of the global, average
(73 km for SeaWiFS and 65 km for MODIS/A). The SeaWiFS values of
are likely larger because of the positive bias, noted above, in resolved variability in low-chlorophyll subtropical regions that also have relatively low horizontal gradient magnitude. Returning, briefly, to the NAAMES site, Figure 5b shows the monthly (open circles) and monthly seasonal climatologies (filled circles) on the same axes as Figure 5a. Note that at
N the data points all plot between ∼50 and ∼200 km; the linear regressions (also significant) imply mean
distances of 95 and 102 km for SeaWiFS and MODIS/A, respectively. Both plots demonstrate that the resolved mesoscale variability can be expressed as a function of the absolute magnitude of the local, horizontal gradient of the low-pass filtered chlorophyll field consistent with the argument that physical stirring acting on the large-scale gradient is an important factor supporting observed mesoscale variability.
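The slope-to-length conversion described above can be sketched numerically. The example below is only an illustration on synthetic data: the function name `mixing_length_km`, the sample values, and the assumed units (RMS anomaly in concentration units, gradient in concentration units per 100 km) are hypothetical and are not taken from the analysis itself.

```python
import numpy as np

def mixing_length_km(grad_per_100km, rms_anomaly):
    """Fit rms_anomaly = slope * gradient + intercept by least squares.

    With the gradient expressed per 100 km, 100 * slope is a length in km.
    Returns (length_km, intercept).
    """
    slope, intercept = np.polyfit(grad_per_100km, rms_anomaly, 1)
    return 100.0 * slope, intercept

# Synthetic, hypothetical data: a 70 km mixing length plus small scatter.
rng = np.random.default_rng(0)
grad = rng.uniform(0.01, 0.5, 500)            # gradient per 100 km
rms = 0.70 * grad + rng.normal(0.0, 0.01, 500)  # sqrt of resolved variability

length_km, offset = mixing_length_km(grad, rms)
print(f"estimated mixing length: {length_km:.1f} km")
```

A nonzero intercept in such a fit would indicate variability that does not scale with the large-scale gradient, which is one reason the regression slope is best read as a scaling estimate rather than an exact length.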

The relationship between chlorophyll anomaly RMS and chlorophyll gradient displayed as the square root of the resolved variability versus the mean chlorophyll gradient magnitude. Here we estimate
with
(see section 2.2) and
by equation (5). (a) The same global set of data points as in Figure 4 with two sets of linear lines, lines of constant
(black dashed lines) and the linear regression of SeaWiFS (blue) and MODIS/A (red) data. All points poleward of 60° have open symbols. (b) The same plot as in Figure 5a, but restricted to the NAAMES data displayed in Figure 2; open symbols are monthly values and filled symbols are monthly seasonal climatologies.
4 Discussion
4.1 Error








Figure 4 demonstrates different aspects of this relationship for large and small Chl a concentrations. We see, in Figure 4a and equation 8, as
becomes large (
), both MODIS unresolved error and SeaWiFS unresolved error scale approximately linearly with Chl a, i.e., CV approaches a constant (
). This implies that unresolved error can be roughly modeled as the first RHS term of equation 7,
. At low Chl a, this is not true for SeaWiFS where coefficient of variation increases substantially as Chl a decreases, implying that an additional constant error term (
) is also present, i.e., as
becomes small, CV increases with the background noise
(equation 8).
4.2 Length Scales
There are two key length scales in this work, the operationally defined, tracer-based mixing scale (
) and the geostatistically derived decorrelation scale length (d). Both
and d lengths are data derived, in fact from different aspects of the same data set. They are different approaches but are not fully independent. The tracer-based mixing scale (
) is computed from estimates of resolved property anomalies and large-scale lateral gradient and is meant to represent a distance a mesoscale eddy would, or could, stir a water parcel containing, in this case, chlorophyll a anomalies. The decorrelation scale length (d) represents a data-derived estimate of the actual distance any two regional mean anomalies must be separated to ensure that they are statistically independent. The interpretation of differences between
and d has many potential challenges because of the differential impacts of physical effects and mesoscale biological variations on the two length scales. After all,
should, in theory, represent physical effects while d (range) should also capture mesoscale biological variations. In principle, it may be possible to separate physical stirring from biological effects; however, this must await methodological improvements that are not yet available. Even if
and d were exactly proportional, there is no reason the ratio should be exactly one (1). For example, choice of a different variogram model would result in a different d and hence a different proportionality. Further, this framework does not account for large regional variations in mesoscale eddy kinetic energy, the effects of which are unknown. Therefore, we caution readers to be careful not to over-interpret perceived differences between
and d; the following discussion should be considered as a scaling argument not an exact relationship.
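The dependence of d on the choice of variogram model can be illustrated with a small numerical experiment. The sketch below fits a spherical and an exponential model to the same synthetic semivariogram; the lag spacing, nugget, sill, grid-search fitting procedure, and the use of the common "practical range" convention (three times the exponential scale parameter) are all assumptions made for illustration, not the fitting method used in this study.

```python
import numpy as np

def spherical(h, a):
    """Unit spherical structure: reaches 1 exactly at lag a."""
    r = np.minimum(h / a, 1.0)
    return 1.5 * r - 0.5 * r ** 3

def exponential(h, a):
    """Unit exponential structure: reaches about 95% of 1 at lag 3a."""
    return 1.0 - np.exp(-h / a)

def fit_range(h, gamma, structure, a_grid):
    """Grid-search the scale a; nugget and partial sill come from
    unconstrained linear least squares for each trial a."""
    best = None
    for a in a_grid:
        A = np.column_stack([np.ones_like(h), structure(h, a)])
        coef, *_ = np.linalg.lstsq(A, gamma, rcond=None)
        sse = np.sum((A @ coef - gamma) ** 2)
        if best is None or sse < best[0]:
            best = (sse, a, coef)
    return best[1], best[2]

# Synthetic semivariogram: spherical, nugget 0.1, partial sill 1.0, range 100 km.
h = np.arange(9.0, 300.0, 9.0)                # lags in km (hypothetical spacing)
gamma = 0.1 + 1.0 * spherical(h, 100.0)
a_grid = np.arange(1.0, 301.0)                # trial scales, 1 km steps

a_sph, _ = fit_range(h, gamma, spherical, a_grid)
a_exp, _ = fit_range(h, gamma, exponential, a_grid)
d_sph = float(a_sph)                          # spherical range
d_exp = 3.0 * float(a_exp)                    # exponential practical range
print(f"spherical d = {d_sph:.0f} km, exponential d = {d_exp:.0f} km")
```

Because the two models define "range" differently, the same data yield different d values, so any ratio between d and the tracer-based scale carries a model-dependent constant.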









The ratio R presents the spatial distribution of two different estimates of physical stirring derived either from structure function parameters or mean Chl a gradients. In addition to stirring, the resolved variability (
) includes contributions from mesoscale biological sources and sinks, as well as leakage (aliasing) of unresolved variability into the resolved variability component, and geophysical noise. If the ratio equals zero, within statistical uncertainties, then it is likely that stirring is a major factor in the observed resolved variability. Positive values (
) indicate areas, such as in the oligotrophic ocean, where it is possible that there is aliasing of unresolved variability into the resolved variability estimate (Figures 3c and 3d). It could be noise; it could also be real mesoscale or submesoscale variability. Positive values are also seen in the temperate North Atlantic, Southern Ocean, and Equatorial Pacific where there may be real biological factors enhancing the observed variability. Negative values (
), such as in some of the western boundary currents (North and South America and Asia), indicate regions where the gradients (Figure 6c) are quite large leading to large estimates of physical stirring. It may be that the simple scaling arguments do not hold in these complex coastal regions where transport is tied to topography and is not isotropic because of the boundary (Gruber et al., 2006, 2011).

Maps of the length parameters derived from MODIS/A (see text for a discussion of SeaWiFS data). In each image, the annual mean value for each 5° cell is plotted for the entire time series (2003–2010). (a) The range (d) in km. (b)
following equation (2) in km. Here we estimate
with
(see section 2.2) and
by equation (5). (c) The local gradient magnitude of the mean chlorophyll concentration (
) in units of
km.
To within the uncertainties noted above, the following can be stated about the mixing length scale observed in our results. Length scales computed from the two different approaches give roughly comparable results with values similar to that expected for mesoscale features (Figures 6a and 6b). Similar to the findings of Doney et al. (2003) both length scales decrease from the tropics to the poles, comparable to the Rossby radius of deformation. Regional differences occur (Figure 7) but it is unclear from where these biases arise; are they due to methodological issues or real geophysical signals?

Ratio (R) of observed resolved variability (
) to predicted variability
as explained in the text and equation (10). The white space, centered on zero, encompasses
standard deviation of R (SeaWiFS: ±0.1662 and MODIS/A:
). A single contour at the
confidence level encloses
grid cells outside 95% of all R values. (a) SeaWiFS R values, note the large positive areas that are beyond the 95% confidence limits may be due, in part, to aliasing of unresolved variability into the resolved variability estimate. (b) MODIS/A R values, note the areas of negative R in common with SeaWiFS. See text for further discussion.
5 Summary
Application of geostatistical analysis techniques allows the variability observed in ocean color imagery to be partitioned into resolved and unresolved variability and gives an estimate of how much pixels differ from one another as a function of the distance between them. This variability can be expressed in terms of a coefficient of variation that allows relative magnitudes of variability to be assessed. Also, the scale of decorrelation clearly relates to the underlying mean chlorophyll a gradient and can be expressed in terms of an idealized mesoscale tracer mixing length.
Patchiness is, by its very nature, a spatial phenomenon. Our estimates of resolved and unresolved variability as well as decorrelation scale lengths are spatial estimates resolved into monthly time series of individual 5° grid cells (or more precisely the ocean beneath that cell) of geostatistical parameters. In this paper, we have demonstrated that the mesoscale-isolated geostatistical parameters exhibit a temporal (seasonal) behavior and vary geographically. We have found that SeaWiFS and MODIS/A resolve approximately equal variance. SeaWiFS data have an unresolved variability component that is larger than MODIS/A at low Chl a concentrations, most likely due to sensor noise characteristics. SeaWiFS and MODIS/A have resolved variability in the range of 10–30% globally, when expressed as coefficient of variation. Globally SeaWiFS and MODIS/A imply a
between 50 and 300 km for sites
latitude and smaller poleward. When comparing satellite derived ocean color data to model output and in developing data assimilation methods, the scale of the comparison (basin, mesoscale, submesoscale, etc.), both magnitude and nature of data source variability (resolved versus unresolved), and the correlation scale lengths should be carefully considered.
The reprocessing version of the ocean color Chl a data used in this study (R2010.0) is an older version of the data than is currently available (for both SeaWiFS and MODIS/A). Improvements in the current reprocessing (R2014.0) might significantly reduce the difference in unresolved variability between the two sensors, a hypothesis that should be explored in future work using the geostatistical techniques outlined here.
On a closing note, it is interesting that none of the time series of geostatistical or other derived parameters shows any sign of a temporal trend for either instrument. Perhaps this is not surprising, because the detection of any trend in ocean color based data is obscured by the shortness of the time series and the large natural interannual and decadal variability of the primary variable (Chl a). Both Henson et al. (2010) and Yoder et al. (2010) found that the SeaWiFS time series record (the longest as of 2010) was insufficient in length to differentiate a global trend from natural variability found in the time series. And although the MODIS/A time series is, as of this writing, 6 years longer than the 2003–2010 record analyzed here, it is only 1 year longer than the SeaWiFS time series analyzed for this study. Henson et al. (2010), in particular, estimate a time series longer than 40 years will be necessary to accomplish this task. For approximately 10
years, the patterns have been rather stable and apparently unchanging, displayed either as time series or as global maps. If we were to have similar information from a time series twice as long (Callander & Mitchell, 1996; Clark et al., 2016; Santer et al., 1995), we might be able to make inferences about how these spatial patterns change or do not change with time. One thing that we can do is monitor the spatial distribution of chlorophyll in the current ocean into the future and observe any changes that may indicate a response to changing forcing factors (Boyd et al., 2015).
Acknowledgments
This work is the product of many years of research generously supported by NASA's Ocean Biology and Biogeochemistry research program under grants NNG05GG30G, NNG05GR34G, NNX14AM36G, NNX15AH13G, and NNX15AE65G to D.M.G. and S.C.D. The authors are grateful to one anonymous reviewer, Carlos E. Del Castillo, and Bryan Franz for improving the manuscript. D.M.G. wishes to especially acknowledge receipt of the FFT-based variogram algorithm from Denis Marcotte, Polytechnique Montréal. All data are available from NASA's OceanColor web site (https://oceancolor.gsfc.nasa.gov/), supported by the Ocean Biology Processing Group (OBPG) at NASA's Goddard Space Flight Center.
Appendix A
A1. Log-Normal Variables
There are a number of references where one can read about log-normal distributions (Aitchison & Brown, 1966; Baker & Gibson, 1987; Campbell, 1995; Limpert et al., 2001). In this short appendix, we hope to provide a discussion of log-normally distributed variables from the data analyst's point of view.


















Coefficient of variation (unitless) as a function of the arithmetic standard deviation of the log base-10 transformed data (
) as per equations (A5) and (A6), assuming original data ξ is log-normally distributed.
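The relationship plotted in this figure can be checked numerically using the standard log-normal result: if the natural log of a variable has standard deviation s, its coefficient of variation is sqrt(exp(s²) − 1), and a base-10 standard deviation converts to s by multiplying by ln(10). The sketch below assumes this textbook identity is what equations (A5) and (A6) express; the function name `cv_from_log10_sd` and the sample values are illustrative only.

```python
import numpy as np

def cv_from_log10_sd(sd_log10):
    """CV of a log-normal variable given the SD of its log10 values.

    Uses the standard identity CV = sqrt(exp(s^2) - 1), with
    s = ln(10) * sd_log10 the SD of the natural-log values.
    """
    s = np.log(10.0) * np.asarray(sd_log10, dtype=float)
    return np.sqrt(np.expm1(s ** 2))

# Monte Carlo check on synthetic log-normal samples (hypothetical values).
rng = np.random.default_rng(1)
sd10 = 0.1
samples = rng.lognormal(mean=0.0, sigma=np.log(10.0) * sd10, size=200_000)
cv_sample = samples.std() / samples.mean()
cv_theory = float(cv_from_log10_sd(sd10))
print(f"theory CV = {cv_theory:.4f}, sample CV = {cv_sample:.4f}")
```

For small log10 standard deviations the CV is close to ln(10) times that standard deviation, so the curve starts out nearly linear before steepening.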