Detectability of an AMOC Decline in Current and Projected Climate Changes

Determining whether the Atlantic Meridional Overturning Circulation (AMOC)'s transport is in decline is challenging due to the short duration of continuous observations. To estimate how many years are needed to detect a decline, we conduct a simulation study using synthetic data that mimics an AMOC time series. The time series' characteristics are reproduced using the trend, variance, and autocorrelation coefficient of the AMOC strength at 26.5°N from 20 Coupled Model Intercomparison Project Phase 5 (CMIP5) models under the RCP8.5 future scenario, and from RAPID observations (2004–2018). Our results suggest that the 14‐year RAPID length has just entered the lower limits of the trend's “detection window” based on synthetic data generated using CMIP5 trends and variability (14–42 years; median = 24 years), but twice the length is required for detectability based on RAPID variability (29–67 years; median = 43 years). The annual RAPID trend is currently not statistically significant (−0.11 Sv yr⁻¹, p > 0.05).


Introduction
Coupled ocean-atmosphere numerical models generally predict a decline of the Atlantic Meridional Overturning Circulation (AMOC) under the influence of anthropogenic warming in the 21st century (IPCC, 2019). The AMOC is responsible for ∼25% of the globe's total meridional heat transport at 26°N, equivalent to 1.3 PW (1 PW = 10¹⁵ W) (Ganachaud & Wunsch, 2003; Hall & Bryden, 1982; Johns et al., 2011; Lavin et al., 1998). Heat is absorbed in surface waters of the tropical Atlantic and carried northward to be released to the atmosphere over the northeast Atlantic. This plays a role in maintaining milder northwestern European atmospheric temperatures, compared to the global average at the same latitudes (Pohlmann et al., 2006; Rhines et al., 2008).
At 26.5°N, the RAPID-MOCHA (Rapid Climate Change/Meridional Overturning Circulation and Heatflux Array) program (hereafter, RAPID) has continuously monitored the AMOC since 2004 (Rayner & Kanzow, 2011). Initially, these observations suggested that a decline in AMOC transport could be occurring (Robson et al., 2014; Smeed et al., 2014), and later a sustained reduced mean transport post-2008 was detected (Smeed et al., 2018). Coupled Model Intercomparison Project Phase 5 (CMIP5) data and other studies using proxy indicators suggest that a long multidecadal declining trend could occur (Caesar et al., 2018; Cheng et

10.1029/2020GL089974
Key Points:
• The estimated 14-year annually averaged RAPID AMOC trend of −0.1 Sv yr⁻¹ is not statistically significant
• The detection window for a long-term AMOC decline is 14-42 years based on future scenario CMIP5 output statistical properties
• Autocorrelation of annually averaged AMOC data is weak and has a limited impact on detection of a long-term decline
However, based on observations there are still ongoing discussions regarding whether the AMOC is in decline (McCarthy et al., 2012, 2015; Moat et al., 2020; Roberts et al., 2014; Smeed et al., 2018).
Detection of long-term (multidecadal) trends can be affected by numerous factors, such as the length of data available, the magnitude of the trend to be detected, the degree of variability (e.g., variance and autocorrelation, or short-term memory), and the trend detection technique used (Beaulieu et al., 2013;Henson et al., 2010;Tiao et al., 1990;Weatherhead et al., 1998). To explore these factors, we use the 14-year RAPID observations and 93-year CMIP5 future scenario model AMOC data which exhibit a range of trends and variability. We also compare four approaches for trend detection to assess sensitivity to the presence of autocorrelation and technique choice.
Previous studies have largely focused on time of emergence of an anthropogenically forced AMOC decline; that is, when a trend falls outside the range of its natural variability (Baehr et al., 2008; Jackson & Wood, 2020; Keller, Deutsch, et al., 2007; Keller, Kim, et al., 2007; Roberts & Palmer, 2012; Roberts et al., 2014; Santer et al., 1995; Vellinga & Wood, 2004; Williams et al., 2015). For example, Roberts et al. (2014) estimate that the −0.53 Sv yr⁻¹ AMOC trend, assessed from the first 8 years of RAPID, takes 18 years to be significantly different from the internal variability found in the CMIP5 preindustrial control simulations.
Here, we investigate the time needed to detect an AMOC decline from a monitoring perspective. As such, in contrast to the studies mentioned above, we do not use the variance from control runs to define the natural variability. We instead focus on detecting a trend signal, which is the step before investigating whether that trend falls outside of natural variability and is anthropogenically driven.
Our aim is to provide a timely estimate of the number of years (hereafter, n*) required to detect a long-term decline in the AMOC at 26.5°N. We compare results using linear regressions that account for autocorrelation versus those that do not, and CMIP5 variability versus RAPID variability. This latter comparison follows one of the key findings from Roberts et al. (2014), that the CMIP5 variability could be underestimated. We present these results as probability density functions, which provide uncertainties ("detection windows") around our n* estimates.

Data
We use the numerical CMIP5 ensemble model output under the RCP8.5 ("business-as-usual") future scenario. Monthly means from 2006 to 2099 are used from the 20 models we select (Table S1 in the supporting information; the data are available from the portal Earth System Grid-Center for Enabling Technologies, on https://esgf-node.llnl.gov/search/cmip5/). To reduce model-associated biases, only one model per institute has been selected. We use the first ensemble member for each CMIP5 model used (r1i1p1, which specifies the realization, initial conditions, and physical parameter constants). For comparison to the observations, the zonally and vertically integrated CMIP5 AMOC streamfunction data are extracted at the closest latitude to 26.5°N and the depth at which the maximum streamfunction is found.
The observations from the RAPID transbasin array at 26.5°N are estimated as a twice-daily time series (McCarthy et al., 2015) from April 2004 to September 2018 (14.5 years; available from https://rapid.ac.uk). The RAPID AMOC time series is averaged to monthly means, then both CMIP5 and RAPID data are averaged to annual means from March to February, leading to a single value for each year (Figure 1). The explanation for using a March start month can be found in section 4.1. The CMIP5 annual time series are therefore 93 years long (from March 2006 to February 2099).

Trend Detection Methods
Throughout this study, the term "linear trend" refers to the slope of a straight line. Four linear regression methods are used to evaluate the impact of autocorrelation on the n* estimate: (1) ordinary least squares (OLS), (2) prewhitening (PW), (3) OLS with the effective number of degrees of freedom (OLSneff), and (4) generalized least squares (GLS), outlined in sections 3.1 to 3.4 below. All methods start with a simple linear trend model as a function of time:
$$Y_t = \mu + \omega t + N_t, \qquad t = 1, \ldots, n, \tag{1}$$
where $Y_t$ are the data (annual means of AMOC transport [1 Sv = 10⁶ m³ s⁻¹]), $\omega$ is the trend magnitude, $\mu$ is the constant term (or intercept), $N_t$ are the residuals, $t$ is time, and $n$ is the sample size (i.e., corresponding to the total number of years of the time series $Y_t$). Classic linear regression models, such as OLS, assume the errors or residuals to be random and independent (white noise), in which case they can be represented as $\varepsilon_t$. This assumption can be violated in climate time series data, where successive data points are often correlated due to short-term memory in the system (Harris et al., 2019; Tiao et al., 1990; Weatherhead et al., 1998). It is standard practice to take into account temporal dependence in climate change detection studies (Hartmann et al., 2013). The other three methods used in this study (PW, OLSneff, and GLS) account for autocorrelation by assuming a first-order autoregressive process [AR(1)]. Lastly, following Smeed et al.'s (2018) detection of a shift in the RAPID data mean strength in 2008, we also compare long-term trends to changepoint models to verify which models best describe the CMIP5 and RAPID AMOC time series (Text S3, using the R package EnvCpt described in Beaulieu & Killick, 2018).

Ordinary Least Squares (OLS)
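Following Equation 1 above, the simplest method to estimate the trend, $\omega$, is OLS:
$$\hat{\omega} = \frac{\sum_{t=1}^{n} (t - \bar{t})(Y_t - \bar{Y})}{\sum_{t=1}^{n} (t - \bar{t})^2}, \tag{2}$$
where the hat ($\hat{\cdot}$) represents an estimate and the bar ($\bar{\cdot}$) represents the mean. With the intercept estimated as $\hat{\mu} = \bar{Y} - \hat{\omega}\bar{t}$, the white noise residuals are then:
$$\hat{\varepsilon}_t = Y_t - \hat{\mu} - \hat{\omega} t. \tag{3}$$
The unbiased estimate of the white noise variance ($\sigma_\varepsilon^2$) is derived from Equation 3:
$$\hat{\sigma}_\varepsilon^2 = \frac{1}{n - 2} \sum_{t=1}^{n} \hat{\varepsilon}_t^2, \tag{4}$$
where the denominator is $n - 2$ since 2 degrees of freedom out of the original $n$ are spent on fitting the two parameters ($\mu$ and $\omega$).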
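The OLS steps described above (trend estimate, residuals, and residual variance with n − 2 degrees of freedom) can be sketched in a few lines. This is a minimal illustration rather than the code used in the study; the function name `ols_trend` and the symbol names `omega`, `mu`, and `sigma2` are assumptions chosen to echo the text, and time is taken as t = 1, ..., n.

```python
def ols_trend(y):
    """Fit Y_t = mu + omega * t + N_t by ordinary least squares, t = 1..n.

    Returns (omega, mu, residuals, sigma2), where sigma2 is the residual
    variance computed with n - 2 degrees of freedom.
    """
    n = len(y)
    t = list(range(1, n + 1))
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    omega = sxy / sxx                # slope (trend magnitude)
    mu = y_bar - omega * t_bar       # intercept
    residuals = [yi - mu - omega * ti for ti, yi in zip(t, y)]
    sigma2 = sum(e * e for e in residuals) / (n - 2)
    return omega, mu, residuals, sigma2


# Illustrative input only: a noise-free decline of 0.1 Sv per year.
omega, mu, residuals, sigma2 = ols_trend([17 - 0.1 * t for t in range(1, 11)])
```

On a noise-free straight line the fit recovers the slope and intercept and the residual variance is zero up to rounding.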

Prewhitening (PW)
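Prewhitening is a transformation to filter out the autocorrelation without removing the trend signal (Cochrane & Orcutt, 1949). The OLS regression is first fitted and then the AR(1) coefficient, $\phi$, is estimated from the OLS residuals:
$$\hat{\phi} = \frac{\sum_{t=2}^{n} \hat{\varepsilon}_t \hat{\varepsilon}_{t-1}}{\sum_{t=1}^{n} \hat{\varepsilon}_t^2}. \tag{5}$$
The prewhitening phase then occurs:
$$Y'_t = Y_t - \hat{\phi} Y_{t-1}.$$
OLS is then once again applied to the transformed time series ($Y'_t$).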
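A small sketch of the two prewhitening ingredients described above: a lag-1 autocorrelation estimate from regression residuals, and the transformation of the series. The estimator shown (lag-1 sum of products over the sum of squares) is one common choice and is assumed here; the function names are hypothetical.

```python
def ar1_coefficient(residuals):
    """Lag-1 autocorrelation estimate from regression residuals."""
    num = sum(residuals[i] * residuals[i - 1] for i in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


def prewhiten(y, phi):
    """Return Y'_t = Y_t - phi * Y_{t-1} for t = 2..n (one value is lost)."""
    return [y[i] - phi * y[i - 1] for i in range(1, len(y))]


# Illustrative values only.
phi_hat = ar1_coefficient([1.0, -1.0, 1.0, -1.0])   # -0.75
transformed = prewhiten([1.0, 2.0, 3.0], 0.5)       # [1.5, 2.0]
```

Note that with this simple form of the transformation, a slope fitted to the prewhitened series equals the original slope multiplied by (1 − phi), so it has to be rescaled before being compared with the untransformed trend.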

OLS With the Effective Number of Degrees of Freedom (OLSneff)
In the presence of autocorrelation, the number of independent observations is reduced. As such, one can account for autocorrelation in the residuals using OLS, by adjusting for the effective number of degrees of freedom (Von Storch & Zwiers, 2003):
$$n_{\mathrm{eff}} = n \, \frac{1 - \hat{\phi}}{1 + \hat{\phi}},$$
where $n_{\mathrm{eff}}$ is the effective sample size. The estimate of the variance in Equation 4 is therefore adjusted using $n_{\mathrm{eff}} - 2$ instead of $n - 2$ in the denominator.
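As a rough illustration of the adjustment described above, the sketch below computes an effective sample size for an AR(1) process and the correspondingly adjusted residual variance. The AR(1) form n(1 − phi)/(1 + phi) is assumed, and the function names are hypothetical.

```python
def effective_sample_size(n, phi):
    """Effective number of independent observations for AR(1) residuals."""
    return n * (1 - phi) / (1 + phi)


def adjusted_variance(residuals, phi):
    """Residual variance using n_eff - 2 instead of n - 2 in the denominator."""
    n_eff = effective_sample_size(len(residuals), phi)
    if n_eff <= 2:
        raise ValueError("effective sample size too small to estimate a variance")
    return sum(e * e for e in residuals) / (n_eff - 2)


# With the RAPID values quoted in the text (14 years, AR(1) coefficient 0.29),
# the effective sample size is a little under 8 years.
n_eff_rapid = effective_sample_size(14, 0.29)
```

A positive AR(1) coefficient shrinks the effective sample size, which inflates the variance estimate and therefore widens the uncertainty on the trend.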

Generalized Least Squares (GLS)
Once again, the starting point is the same linear regression as Equation 1, written in matrix notation as:
$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{e},$$
where $\boldsymbol{\beta} = (\mu, \omega)'$. Now $\mathbf{e}$ is treated as a random vector from the multivariate normal distribution $N(\mathbf{0}, \mathbf{V})$, where $\mathbf{V}$ is a covariance matrix. In order to estimate the linear regression coefficients, the covariance matrix must be incorporated into the previous OLS estimate in Equation 2, as follows:
$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{V}^{-1}\mathbf{X})^{-1}\mathbf{X}'\mathbf{V}^{-1}\mathbf{Y},$$
where $\mathbf{X}'$ is the transpose of $\mathbf{X}$ (defined through the classical Euclidean scalar product). The covariance matrix has the effect of scaling the errors and hence "decorrelating" them (a detailed description can be found in Draper & Smith, 1998, or Brockwell & Davis, 2002). Here, it is assumed that $\mathbf{V}$ is the covariance matrix of an AR(1) process:
$$V_{ij} = \frac{\sigma_\varepsilon^2}{1 - \phi^2}\, \phi^{|i - j|}, \qquad i, j = 1, \ldots, n.$$
Since the covariance matrix is unknown, the $\hat{\sigma}_\varepsilon^2$ and $\hat{\phi}$ from the initial OLS residuals in Equations 4 and 5, respectively, are used as approximations. The R package nlme is used to estimate the GLS trend (Pinheiro & Bates, 2000).
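To make the GLS estimator concrete, here is a minimal pure-Python sketch for an AR(1) error structure with a given coefficient `phi`. It is not the nlme procedure used in the study: `phi` is supplied rather than estimated, and the function names are hypothetical. It relies on the known tridiagonal form of the inverse AR(1) covariance matrix; the white noise variance cancels in the coefficient estimate, so it is omitted.

```python
def ar1_precision(n, phi):
    """Inverse AR(1) covariance matrix, up to the factor 1 / sigma_eps^2."""
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        P[i][i] = 1.0 if i in (0, n - 1) else 1.0 + phi * phi
        if i + 1 < n:
            P[i][i + 1] = P[i + 1][i] = -phi
    return P


def gls_trend(y, phi):
    """Return (mu, omega) from beta = (X' V^-1 X)^-1 X' V^-1 Y, with t = 1..n."""
    n = len(y)
    X = [[1.0, float(t)] for t in range(1, n + 1)]
    P = ar1_precision(n, phi)
    PX = [[sum(P[i][k] * X[k][j] for k in range(n)) for j in range(2)]
          for i in range(n)]
    Py = [sum(P[i][k] * y[k] for k in range(n)) for i in range(n)]
    # Normal equations A @ beta = b, with A = X' P X and b = X' P y.
    A = [[sum(X[k][i] * PX[k][j] for k in range(n)) for j in range(2)]
         for i in range(2)]
    b = [sum(X[k][i] * Py[k] for k in range(n)) for i in range(2)]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    mu = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    omega = (A[0][0] * b[1] - A[1][0] * b[0]) / det
    return mu, omega


# Illustrative input only; with phi = 0 the result reduces to the OLS fit.
mu_gls, omega_gls = gls_trend([1.0, 2.0, 4.0, 3.0, 5.0], 0.0)
```

Setting `phi` to zero makes the weighting matrix the identity, which is a quick way to check the sketch against an OLS fit of the same series.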

Synthetic Series
We define a "year" of data as an average of monthly means from March to February. The RAPID AMOC time series parameters (the trend, variance and AR(1) coefficient) are sensitive to the start month chosen to generate annual means (Figure S2). A March start month is chosen for three reasons: (1) the three RAPID parameter values are close to their respective averages (computed using the 12 start months; Figure S2), (2) it allows for conservation of data, producing 14 years of RAPID data, and (3) CMIP5 model parameters are not very sensitive to start months (Figure S3).
The three parameters are then extracted from the 93-year-long numerical model simulations or 14-year-long RAPID observations. As described in section 3, the parameters are (1) the trend magnitude ($\omega$), (2) the white noise variance ($\sigma_\varepsilon^2$), and (3) the AR(1) coefficient ($\phi$). Two sets of simulations are then generated from these parameters. The first set uses the three parameters from the CMIP5 models only (hereafter, simCMIP5var) and the second combines the CMIP5 trends with the RAPID variance and AR(1) coefficient (hereafter, simRAPIDvar).
Figure 2. Distributions of the three parameters estimated from the CMIP5 models (Table S1). Panels (a) and (b) represent the trend ($\omega$) and AR(1) coefficient ($\phi$) distributions, respectively, which follow a truncated Johnson distribution, and panel (c) represents the variance ($\sigma_\varepsilon^2$) distribution, which follows a truncated inverse Gaussian distribution. The coefficients describing the shape of each curve are also displayed, top-left of each subplot.
For simCMIP5var, the distribution of the 20 CMIP5 values for each parameter is used to calculate empirical cumulative distribution functions (eCDFs) of $\omega$, $\sigma_\varepsilon^2$, and $\phi$ (Figure 2). In order to randomly generate plausible values from CMIP5 distributions, we select parametric cumulative distribution functions (pCDFs) that best fit the eCDFs. A Johnson distribution (Johnson, 1949) best fits the $\omega$ and $\phi$ eCDFs and an inverse Gaussian distribution (Wald, 1944) best fits the $\sigma_\varepsilon^2$ eCDF (the coefficients to describe each pCDF are found in Figure 2 and Text S4). The pCDFs are truncated such that their tails do not extend beyond the minimum and maximum values within the CMIP5 ranges. In order to generate synthetic time series simulations, random values are selected 1,000 times from each parameter's truncated pCDF. These drawn values are then used to produce 1,000 synthetic simulations of the AMOC transport.
For simRAPIDvar, the same randomly picked CMIP5-based trend values as in simCMIP5var are used (from the pCDF in Figure 2a). The difference here is that all 1,000 simulations use RAPID's white noise variance (2.33 Sv²) and AR(1) coefficient (0.29). This simRAPIDvar setup follows suggestions from previous studies that AMOC variability is underestimated in CMIP5 models on interannual timescales (McCarthy et al., 2012; Roberts et al., 2014).
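The truncated sampling step described above can be illustrated with simple rejection sampling. The study fits Johnson and inverse Gaussian distributions; neither is available in the Python standard library, so a normal distribution with made-up parameters stands in for the fitted trend pCDF here. Only the truncation bounds (the CMIP5 trend range of −0.12 to −0.02 Sv per year) come from the text; everything else, including the function name `sample_truncated`, is an assumption.

```python
import random


def sample_truncated(draw, lower, upper, size, max_tries=100000):
    """Rejection sampling: keep only draws that fall inside [lower, upper]."""
    kept = []
    tries = 0
    while len(kept) < size:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("too many rejected draws; check the bounds")
        value = draw()
        if lower <= value <= upper:
            kept.append(value)
    return kept


rng = random.Random(42)
# Hypothetical stand-in for the fitted trend distribution (not the Johnson fit).
trends = sample_truncated(lambda: rng.gauss(-0.06, 0.03), -0.12, -0.02, 1000)
```

Rejection sampling keeps the shape of the underlying distribution inside the bounds and simply discards the tails, which matches the idea of truncating a fitted curve at the minimum and maximum model values.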
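A synthetic series of the kind described above can be generated as a linear trend plus AR(1) noise. The sketch below is an illustration under stated assumptions, not the study's code: the intercept `mu` of 17 Sv is an arbitrary placeholder, the noise is started from its stationary distribution, and the function name is hypothetical. The variance 2.33 Sv² and AR(1) coefficient 0.29 in the usage line are the RAPID values quoted in the text, and −0.06 Sv per year is the median CMIP5 trend.

```python
import math
import random


def synthetic_amoc(n, omega, sigma2, phi, mu=17.0, rng=None):
    """Generate Y_t = mu + omega * t + N_t with AR(1) noise N_t, t = 1..n.

    N_t = phi * N_{t-1} + eps_t, where eps_t is Gaussian white noise with
    variance sigma2. The intercept mu is an arbitrary placeholder value.
    """
    if rng is None:
        rng = random.Random()
    sd = math.sqrt(sigma2)
    # Start the noise from the stationary distribution of the AR(1) process.
    noise = rng.gauss(0.0, sd / math.sqrt(1.0 - phi * phi))
    series = []
    for t in range(1, n + 1):
        series.append(mu + omega * t + noise)
        noise = phi * noise + rng.gauss(0.0, sd)
    return series


# One 14-year series with RAPID-like variability and the median CMIP5 trend.
example = synthetic_amoc(14, -0.06, 2.33, 0.29, rng=random.Random(0))
```

Passing a seeded `random.Random` instance makes each synthetic series reproducible, which helps when comparing regression methods on identical data.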

n* Approximation
Trends are estimated from the simulated time series as a function of sample size. For each of the 1,000 randomly selected combinations of parameters, 100 time series are generated for each sample size (8 through to 100 years), in order to ensure independence throughout the trend detection method. For each sample of simulated data, we use the four regression models described in section 3 and identify whether the trend is significantly different from zero. Note that since model projections agree on a decreasing trend, the hypothesis tested here is only a negative trend (i.e., one-tailed test). A statistically significant declining trend with 95% confidence is identified when $\hat{\omega} < 0$ and p < 0.05. The proportion of the 100 time series generated for each sample size with significant declining trends is counted. Three different powers of detection are used in this study, meaning that the minimum sample size with significant trends in 80%, 90%, or 95% of the 100 time series is identified as the length of data required to detect a declining trend (n*). This is repeated for each of the 1,000 simulations, and the results are represented as a probability density function (PDF) and summarized as a median n* and the 5th-95th percentile interval representing the uncertainty, which we also call a "detection window."
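The search for n* described above can be sketched for a single parameter combination and the OLS method only. This is a simplified illustration, not the study's implementation: the one-tailed critical value uses a Cornish-Fisher approximation to the Student-t quantile (an assumption made to stay within the standard library), the intercept is set to zero since it does not affect the slope test, and all function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def t_quantile(p, df):
    """Cornish-Fisher approximation to the Student-t quantile (assumption)."""
    z = NormalDist().inv_cdf(p)
    g1 = (z ** 3 + z) / 4
    g2 = (5 * z ** 5 + 16 * z ** 3 + 3 * z) / 96
    return z + g1 / df + g2 / df ** 2


def ar1_series(n, omega, sigma2, phi, rng):
    """Linear trend plus AR(1) noise, t = 1..n (zero intercept)."""
    sd = math.sqrt(sigma2)
    noise = rng.gauss(0.0, sd / math.sqrt(1.0 - phi * phi))
    out = []
    for t in range(1, n + 1):
        out.append(omega * t + noise)
        noise = phi * noise + rng.gauss(0.0, sd)
    return out


def significant_decline(y, alpha=0.05):
    """One-tailed OLS test: slope negative and t statistic below the critical value."""
    n = len(y)
    t_bar = (n + 1) / 2
    y_bar = sum(y) / n
    sxx = sum((t - t_bar) ** 2 for t in range(1, n + 1))
    omega = sum((t - t_bar) * (yt - y_bar) for t, yt in enumerate(y, 1)) / sxx
    mu = y_bar - omega * t_bar
    rss = sum((yt - mu - omega * t) ** 2 for t, yt in enumerate(y, 1))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0.0:
        return omega < 0
    return omega < 0 and omega / se < t_quantile(alpha, n - 2)


def n_star(omega, sigma2, phi, power=0.9, n_min=8, n_max=100,
           n_series=100, seed=0):
    """Smallest n at which the detected fraction reaches the requested power."""
    rng = random.Random(seed)
    for n in range(n_min, n_max + 1):
        hits = sum(significant_decline(ar1_series(n, omega, sigma2, phi, rng))
                   for _ in range(n_series))
        if hits / n_series >= power:
            return n
    return None  # not detected within n_max years


# Illustrative parameters only: -0.1 Sv per year, unit variance, no memory.
years_needed = n_star(-0.1, 1.0, 0.0, power=0.8)
```

Repeating `n_star` over many sampled parameter combinations, and over the other regression methods, would give the distribution of n* from which a median and a 5th-95th percentile window can be read.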

To assess the impact of changes of parameters more generally, we also include the Weatherhead et al. (1998) approximation, which estimates n* in monthly time series as a function of the noise standard deviation, the trend magnitude, and the first-order autocorrelation:
$$n^* \approx \left[ \frac{\lambda\, \sigma_N}{|\omega|} \sqrt{\frac{1 + \phi}{1 - \phi}} \right]^{2/3}, \tag{10}$$
where $\sigma_N$ is the noise standard deviation and $\lambda$ is a unitless constant that depends on the critical level and the power of detection.

Results
The 20 CMIP5 parameter values range from −0.02 to −0.12 Sv yr⁻¹ for $\omega$, 0.32 to 1.29 Sv² for $\sigma_\varepsilon^2$, and −0.17 to 0.48 for $\phi$ (see Table S1 for each model's values and results from the Durbin and Watson (1950) and Engle (1982) residual analyses; Texts S1 and S2). A model comparison between linear trend and changepoint models (Beaulieu & Killick, 2018; Killick et al., 2012) suggests that the best model fit for 10 of the 20 model time series is a trend with first-order autocorrelation (see Table S2). Though this is the most commonly selected model, the information criterion results for the other 10 CMIP5 models do not agree on a single alternative. This serves as a motivation to verify the impact of accounting for autocorrelation on trend detection of an AMOC decline.
The n* PDFs (Figure 3) indicate the minimum number of years required to detect a statistically significant long-term decline in the AMOC (using a 5% critical level). For simCMIP5var (Figures 3a-3c), the GLS median n* for a 90% power of detection is 24 [14-42] years. The ranges presented in brackets after the median values are the 5th-95th percentiles (or "detection window"). Results from the OLS analysis produce the lowest n* estimates: 18 [10-32], 21, and 23 [14-41] years for an 80%, 90%, and 95% power of detection, respectively. These results are expected since autocorrelation in a time series can be confounded with a linear trend, hence n* increases when using a regression method that accounts for short-term memory (Beaulieu et al., 2013; Harris et al., 2019; Weatherhead et al., 1998). The other two regression methods, OLSneff and PW, produce median n* values above the OLS median and below the GLS median results. When considering the uncertainty of these results, the four regression methods produce median n* estimates that are within the detection window of all GLS results, suggesting that accounting for autocorrelation does not have a large impact on annual AMOC trend detection.
For simRAPIDvar, the 14-year annual trend from RAPID of −0.11 Sv yr⁻¹ is not used, first, because it is not significantly different from zero (p > 0.05, Table S1), and second, because this study's aim is to estimate the n* for a long-term (multidecadal) decline scenario. Here, compared to simCMIP5var, the median GLS n* values from the PDFs increase by 18-20 years (to 43 [29-67] years and 47 years for a 90% and 95% power of detection, respectively). Again, the OLS, OLSneff, and PW median n* results fall within the GLS detection window. Comparing results from simCMIP5var and simRAPIDvar indicates that, as expected, n* increases with an increase in variability or noise; RAPID's variance is almost twofold larger than the largest CMIP5 value (MPI-ESM-LR; Table S1).
The Weatherhead et al. (1998) n* approximation in Equation 10 also supports that the number of years necessary to detect a trend is influenced by three terms: the trend magnitude, the standard deviation, and the strength of autocorrelation. When using a critical level of 5% and a power of detection of 90%, the constant is on the order of 4.1 (or 3.6 for a 95% detection and 4.5 for an 80% detection). With the same ranges of trend magnitude and autocorrelation coefficient, a change in the noise standard deviation impacts the detection timescale. Taking, for example, the median CMIP5 trend of −0.06 Sv yr⁻¹, the median CMIP5 variance of 0.6 Sv², and the median CMIP5 autocorrelation coefficient of 0.2, we obtain an n* of 26 years. If we now replace the median variance by a variance of 2.3 Sv², as observed in the RAPID AMOC time series, n* increases to 40 years.
Based on this approximation, a positive autocorrelation increases n* by a factor of $\left(\frac{1 + \phi}{1 - \phi}\right)^{1/3}$. For example, when comparing two time series with the same trend magnitude (say −0.1 Sv yr⁻¹) and a noise standard deviation of 1 Sv, the n* from a time series with no autocorrelation versus a positive autocorrelation coefficient of 0.5 increases from 19 to 27 years. These results from the approximation in Equation 10 demonstrate that n* increases with a larger variance, stronger autocorrelation, and weaker trend magnitude.
Figure 3. PDFs of the minimum number of years required to detect a declining AMOC trend (n*). Panels (a)-(c) are results from simCMIP5var (generated using 1,000 values from the pCDFs of the CMIP5 trends, AR(1) coefficients, and variances in Figure 2). Panels (d)-(f) are results from simRAPIDvar (generated with the same trends as in simCMIP5var but RAPID's AR(1) coefficient, 0.29, and variance, 2.33 Sv²). The four linear regressions used are displayed: ordinary least squares ("ols") in green, OLS with effective number of degrees of freedom ("neff") in blue, prewhitening ("pw") in red, and generalized least squares ("gls") in black. All methods produce similar results, when comparing the medians of the PDFs. The power of detection increases from 80% in (a) and (d), to 90% in (b) and (e), and to 95% in (c) and (f). See section 4 for further details on the simulation experiment setup.
We also quantify the effect of different start months to compute a year of data on the n* using the approximation from Weatherhead et al. (1998). We set the trend value to the median CMIP5 trend of −0.06 Sv yr⁻¹ (since none of the start months produces a significant RAPID AMOC trend, p > 0.05), while the 12 different RAPID AMOC variance and AR(1) coefficients from the different start months are used. The n* estimates only vary between 39 and 44 years, with a median of 43 years, which coincides with the March start month n* (Table S3).
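The autocorrelation factor quoted above can be checked numerically. The sketch below writes the Weatherhead et al. (1998) approximation in its commonly cited form, with the unitless constant left as an argument (named `lam` here, a hypothetical name) because its exact value depends on the critical level and power of detection. It is offered as an illustration of how the three terms interact, not as a reproduction of the study's calculation.

```python
import math


def weatherhead_n_star(omega, sigma_n, phi, lam):
    """Approximate years needed to detect a trend of magnitude |omega|.

    sigma_n is the noise standard deviation, phi the lag-1 autocorrelation,
    and lam a unitless constant set by the critical level and the power.
    """
    ratio = lam * sigma_n / abs(omega)
    return (ratio * math.sqrt((1 + phi) / (1 - phi))) ** (2 / 3)


def autocorrelation_factor(phi):
    """Multiplicative increase in n* caused by lag-1 autocorrelation phi."""
    return ((1 + phi) / (1 - phi)) ** (1 / 3)


# For phi = 0.5 the factor is the cube root of 3 (about 1.44), so a 19-year
# detection time without autocorrelation becomes roughly 27 years.
factor = autocorrelation_factor(0.5)
years_with_memory = round(19 * factor)
```

Because the factor depends only on the autocorrelation coefficient, it applies regardless of the constant, trend, or noise level chosen, which is why the 19-to-27-year comparison holds for any fixed trend and noise standard deviation.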

Discussion
We have estimated the number of years of AMOC transport monitoring required to detect a statistically significant long-term decline at 26.5°N. We use synthetic simulations based on parameters from CMIP5 numerical model output and RAPID in situ observations. For a 90% power of detection, synthetic time series based on CMIP5 trends and variability (simCMIP5var) produce a median n* of 24 [14-42] years, with brackets representing the 5th-95th percentiles or "detection window". The RAPID time series length is therefore at the edge of the simCMIP5var detection window. When including RAPID's larger variability and autocorrelation (simRAPIDvar), n* increases to 43 [29-67] years. This result is to be expected since a lower signal-to-noise ratio increases the trend detection time. This is corroborated by the n* approximation from Weatherhead et al. (1998); the combination of a strong trend, low white noise standard deviation and weak short-term memory produces the fastest detection time.
Previous work has accounted for the AMOC's autocorrelation by using different techniques, such as generation of ARMA simulations (Roberts et al., 2014), adjusting for the effective number of degrees of freedom (Smeed et al., 2014), and bootstrapping techniques. Here, a number of regression methods are used to compare results from trend detection approaches that do not account for autocorrelation (OLS) with a few that do (PW, OLSneff, and GLS). While OLS gives smaller n* values than GLS (by 3-7 years, depending on the power of detection and simulation experiment setup), all n* median values are within the detection window of each other. This demonstrates that autocorrelation in the AMOC time series on an annual timescale has a limited impact on detecting its decline. Such a finding is beneficial as a baseline for future AMOC trend detection studies since our results suggest that a simple least squares regression technique is sufficient to capture the signal. We do not rule out, however, that the decorrelation timescale of the AMOC could be less than a year and hence using a daily or monthly mean transport may change these results.
The n* estimates are dependent on the simulation experiment setup, and therefore careful choices have been made to best replicate the AMOC's statistical properties (i.e., trend, variance and AR(1) coefficient), especially for simRAPIDvar. The 14-year RAPID AMOC trend (−0.11 Sv yr⁻¹) is not used since it is not significantly different from zero (p > 0.05). This contrasts with the roughly fivefold larger trend magnitude estimated from 2004 to 2012 (−0.53 Sv yr⁻¹, p < 0.01) by Roberts et al. (2014). This difference can potentially be explained by the fact that the minimum AMOC transport strength recorded in 2009-2010 (McCarthy et al., 2012) induces a stronger 8-year trend, and the slightly increased mean transport post-2012 (Smeed et al., 2018) weakens the 14-year trend. RAPID AMOC variability is used in simRAPIDvar to account for the potentially underestimated variability in CMIP5 time series on interannual timescales (McCarthy et al., 2012; Roberts et al., 2014). Finally, we choose to characterize the overall noise in the system (the variance and autocorrelation) from the same time series (RAPID), as opposed to the Roberts et al. (2014) approach, which uses the CMIP5 AR(1) coefficients and the white noise variability from the RAPID data.
Although the RAPID AMOC parameters (trend, variance and AR(1) coefficient) are sensitive to the start month used for annual averaging (Figure S2), which could be due to large subannual variability (Hirschi et al., 2007), n* is largely unaffected. This is shown quantitatively by the Weatherhead et al. (1998) n* approximation (Table S3), which produces a range of only 5 years (from 39 to 44 years) using different start months. This could be due to the variance and AR(1) coefficient values showing opposing behaviors as a function of start month (Figure S2), and hence having a minor impact on the overall noise term and n*. The March start month has been used here, which coincides with the median Weatherhead et al. (1998) n* of 43 years (Table S3).
Several assumptions were made in this study that may impact the results. First, the parameters estimated from the models and observations are assumed to be stationary and therefore constant through time. However, this assumption seems reasonable since the Engle-ARCH test (Text S2 and Table S1) shows that 19 of the 20 models have a constant variability through time. As for the trend itself, even though we do not rule out that it may change, we do not currently have evidence for a changing trend to include it in our simulation experiment. The presence of long-term memory in the AMOC time series could affect our results since long-term memory will tend to increase trend detection time (e.g., Ludescher et al., 2017). However, our study suggests that the annual AMOC time series (from CMIP5 models and RAPID) displays short-term memory, if any.

Conclusion
Due to the numerous potential impacts of a decline in the AMOC transport, detecting whether its trend is significant is timely. Here, our main findings are threefold. First, the longest continuous observations available (14 years) from the annual RAPID time series do not exhibit a significant trend. Second, when using the variability from CMIP5 under the RCP8.5 scenario, the 14-year length has only just entered the n* detection window (14-42 years), and the inclusion of the larger RAPID variability means that at least 29 years are required. Finally, autocorrelation does not significantly affect trend detection of a multidecadal AMOC decline.

Data Availability Statement
The CMIP5 data can be found from the portal Earth System Grid-Center for Enabling Technologies (ESG-CET) at this site (https://esgf-node.llnl.gov/search/cmip5/). Data from the RAPID AMOC monitoring project are funded by the Natural Environment Research Council and are freely available at this site (https://rapid.ac.uk).