Effectiveness of CMIP5 Decadal Experiments for Interannual Rainfall Prediction Over Australia
Abstract
Phase five of the Coupled Model Intercomparison Project enabled a range of decadal modeling experiments where climate models were initialized with observations and allowed to evolve freely for 10–30 years. However, climate models struggle to realistically simulate rainfall and the skill of rainfall prediction in decadal experiments is poor. Here, we examine how predictions of sea surface temperature anomaly (SSTA) indices from Coupled Model Intercomparison Project Phase 5 decadal experiments can provide skillful rainfall forecasts at interannual timescales for Australia. Forecasts of commonly used SSTA indices relevant to Australian seasonal rainfall are derived from decadal hindcasts of six different climate models and corrected for model drift. The corrected indices are then combined to form a multimodel ensemble. The resultant forecasts are used as predictors in a statistical rainfall model developed in this study. As SSTA forecasts lose skill with increasing lead time, a new methodology for predicting interannual rainfall is proposed. We allow our statistical prediction model to evolve with lead time while accounting for the loss of skill in SSTA forecasts instead of using one statistical model for all lead times. Results in this pilot study across two of the largest climate zones in Australia show that SSTA outputs from the decadal experiments provide enhanced skill in rainfall prediction over using the conventional model (based purely on lagged observed indices) up to a maximum of three years ahead. This methodology could be used more broadly for other regions around the world where rainfall variability is known to have strong links to ocean temperatures.
Key Points
- A novel model combination approach is developed and applied to Australian rainfall prediction using CMIP5 decadal experiment outputs
- The method statistically combines independent models based on both empirical (lagged) and dynamical (concurrent) relationships
- Significant improvements in rainfall forecasting skill over using just the empirical models are noted out to a lead of 3 years
1 Introduction
Reliable and accurate forecasting of seasonal rainfall is of high value and, thus, has attracted considerable research efforts in recent years. Many different systems have been developed and used by meteorology departments around the world to issue operational forecasts. These forecasting systems are based on dynamical models, statistical models, or a combination of both. Dynamical systems predict rainfall using numerical weather prediction models or climate forecasting models. Krishnamurthy et al. (2019) used the models from the Geophysical Fluid Dynamics Laboratory for forecasting summer rainfall over the Intra-Americas Sea and found that the high-resolution atmosphere-only model forced with observed sea surface temperature anomaly (SSTA) led to the best simulation of the Caribbean low-level jet and hence the highest forecast skill. Where high-resolution simulations are not available, regional climate models nested in coupled general circulation models (GCMs) have been used for seasonal rainfall forecasting studies over Vietnam (Phan-Van et al., 2018), West Africa (Siegmund et al., 2015), China (Yuan et al., 2012), and globally (Liu et al., 2014). Statistical systems employ relationships between large-scale climate predictors and rainfall to issue such forecasts. For instance, the India Meteorological Department issues long-range forecasts of the Indian summer monsoon rainfall using SSTA and surface pressure anomalies over the North Atlantic and the tropical Indian and Pacific Oceans as predictors alongside land surface air temperature anomalies over Europe and north central Pacific zonal wind anomalies (Rajeevan et al., 2007). Extremes in Indian summer monsoon rainfall have also been linked to and predicted with useful skill using indices of the El Niño–Southern Oscillation (ENSO) and the Indian Ocean Dipole (IOD; Krishnaswamy et al., 2014; Rajagopalan & Molnar, 2014). More recently, B Wang et al. (2015) suggested that including new predictors such as central Pacific ENSO, Asian low, and spring North and South Pacific Highs could significantly improve the Indian Monsoon forecast skill. Similar statistical forecasting studies have also been undertaken for seasonal rainfall over Thailand (Babel et al., 2017), the East Asian Summer Monsoon over China (Xing et al., 2016), and West Africa (Sittichok et al., 2016).
- Calibration: GCMs are used to obtain raw rainfall forecasts, which are then bias corrected based on previous skill between forecasts and observations. This method is good for seasons and regions over which the rainfall forecasts are known to be skillful.
- Bridging: GCMs are used to obtain forecasts of large-scale climate indices (CIs; based on SSTA indices, winds, etc.), which are then passed through a statistical rainfall prediction model. This method is good for seasons and regions over which rainfall is known to be strongly linked to CIs. The statistical model is based on a Bayesian Joint Probability scheme (BJP, Q J Wang et al., 2009), which forms individual regression models between rainfall and concurrent GCM SSTA indices (Schepen et al., 2012) and then combines these regression models using a Bayesian Model Averaging (BMA) procedure (Q J Wang et al., 2012) to forecast seasonal rainfall.
- Merging: The dynamical (calibration) and statistical (bridging) rainfall forecasts are combined using BMA.
Rainfall variability in Australia has long been linked to large-scale changes in the atmospheric circulation patterns driven by temperature changes over the Pacific and Indian Oceans (Ashok et al., 2003; Chiew et al., 1998; Drosdowsky, 1993b; Kirono et al., 2010; Nicholls, 1984b; Ramsay et al., 2008; Risbey et al., 2009; Taschetto et al., 2009; Ummenhofer et al., 2011). The positive (negative) phase of the ENSO leads to drying (wetting) over large parts of Australia (Power et al., 2006; G Wang & Hendon, 2007). IOD, the difference in SSTA between the central Indian Ocean and the Maritime Continent, has been shown to affect winter and spring rainfall (Ashok et al., 2003; Cai et al., 2011; Verdon & Franks, 2005) and has been linked to droughts over Australia (Ummenhofer et al., 2009). The El Niño Modoki, characterized by warm SSTA over the central tropical Pacific flanked by colder SSTAs to the east and west, has been shown to affect winter rainfall and wind anomalies over Australia (Ashok et al., 2007; Ashok et al., 2009) and has been linked to a shorter and more intense Australian monsoon season (Taschetto et al., 2009). These indices also affect climate in other parts of the planet. For instance, ENSO affects rainfall over the southwest of the United States (Andrade & Sellers, 1988), central Asia (Mariotti, 2007), winter rainfall over Europe (Zanchettin et al., 2008), and leads to global climate impacts (McPhaden et al., 2006). The El Niño Modoki affects the climate over South America, New Zealand, Japan, and India alongside Australia (Ashok et al., 2007). The IOD influences precipitation over India (Ashok et al., 2001), East Asia (Guan & Yamagata, 2003), and Africa (Black, 2005; Ummenhofer et al., 2009) besides Australia. In addition to these tropical Pacific and Indian Ocean climate patterns, Australian rainfall has also been linked to SSTAs over the South Tasman Sea (Drosdowsky, 1993a) and the Indonesian Seas (Nicholls, 1984a). Schepen et al. (2012) showed these indices to be useful predictors of Australian rainfall, and a subset of these indices is used for operational forecasting by the Australian Bureau of Meteorology (BoM). More recently, the tropical transbasin variability (TBV) index (Chikamoto et al., 2015), a dipole between the tropical Pacific and the tropical Atlantic, has been shown to significantly affect Australian rainfall (Choudhury, Sen Gupta, et al., 2016; Johnson et al., 2018). Moreover, the TBV has been reported to have multiyear predictability, which is much longer than that associated with other such SSTA indices (e.g., ENSO; Chikamoto et al., 2015; Choudhury, Sen Gupta, et al., 2016). Chikamoto et al. (2015) suggest that the longer-term predictability of TBV compared to ENSO arises primarily from coupled feedback processes between the Atlantic, Indian, and Pacific Oceans via Walker circulation displacements. Chikamoto et al. (2015) used partially coupled experiments to show the importance of Atlantic SSTAs in modulating Pacific variability. They suggest that the coupled Indian-Atlantic-Pacific system, described by the TBV index, exhibits longer-term variability than the tropical Pacific alone, leading to longer-term predictability in addition to the higher predictability expected from TBV's larger spatial extent. However, a detailed explanation for how this happens has yet to be provided.
A set of experiments was conducted as part of Phase 5 of the Coupled Model Intercomparison Project (CMIP5) to bridge the gap between seasonal forecasting systems and climate projections (Kirtman et al., 2013). These “near-term” or “decadal” experiments are a collaboration between the World Climate Research Programme and the Working Group on Seasonal to Interannual Prediction and provide simulations over a timescale of 10–30 years. Forecasts at decadal timescales would be beneficial for many industries, including agriculture, water supply, energy, and fisheries. These experiments are initialized with the observed ocean state, like seasonal predictions, and also account for changes in external forcings, such as greenhouse gases, aerosols, and solar activity, which are ignored in seasonal forecasts. To isolate the predictable part of the climate response from the unpredictable short-term climate variability, multiple simulations (called ensembles) of the same model with slightly perturbed initial conditions are carried out. To evaluate the merit of such experiments, multiple modeling groups have run hindcast experiments initialized every year or every 5 years from 1960 to 2005.
GCMs are known to have significant difficulties in realistically simulating rainfall (Collins et al., 2011; Lin, 2007; Perkins et al., 2007; Stephens et al., 2010; Sun et al., 2006). This is in part because precipitation is strongly affected by local processes that occur over short timescales, which climate models struggle to reproduce. However, studies suggest that GCMs are better at forecasting sea surface temperature at timescales longer than the annual timeframe, including from the decadal prediction experiments (Chikamoto et al., 2015; Choudhury et al., 2015; Corti et al., 2012; Gonzalez & Goddard, 2016; Mehta et al., 2013). This can be attributed to the fact that SSTA, especially over regions linked to large-scale climate phenomena like ENSO, generally have a longer memory and evolve more slowly. It should be noted that representation of these large-scale climate patterns in climate models may contain biases that might result in changes in predictability (Guemas et al., 2012; Meehl & Teng, 2012). Despite this, using forecasts of SSTA indices as predictors of rainfall may prove more skillful than using rainfall forecasts directly. Previous studies have shown simple statistical rainfall forecasting systems based on large-scale climate patterns can outperform rainfall forecasted using sophisticated weather prediction models (Rajeevan et al., 2007; B Wang et al., 2015; Westra & Sharma, 2010; Wilks, 2011). For Australia, the merit of using SSTA fields driven by a multimodel combination approach (Kaiser Khan et al., 2014) over using a single SSTA field to issue seasonal rainfall forecasts has been previously documented by Khan et al. (2015). The “bridging” component of the operational forecasting system of BoM is such a statistical model based on rainfall-CI relationships.
A challenge in the above approach, however, is accounting for the reduced skill of predicted SSTA indices as lead times increase. This temporal decay of prediction skill also applies to decadal experiments, where it is compounded by climate drift that introduces significant systematic biases in the simulated variables (Mehrotra et al., 2014). A new approach is needed that accounts for model drift, the deterioration of forecast skill with lead time, and the fact that the same predictability is not achievable for all climate variables of interest. In view of this, the present study aims to introduce a novel methodology for forecasting rainfall (or any other variable) that has low direct predictability (because it is poorly simulated in models) by using a suite of available information while letting the statistical prediction model evolve temporally. The model is referred to as a hierarchical linear combination (HLC) model and is used to forecast rainfall at seasonal to interannual timescales using SSTA indices as predictors. Two sets of predictor indices are considered: observed indices available at the start of the forecast and concurrent predicted SSTA indices derived from the decadal experiments. The rationale is that multiple SSTA predictors are necessary to obtain the best possible rainfall forecast, including observed indices (which provide skill based on a lagged relationship with rainfall) and concurrent modeled indices (which provide skill where there is a concurrent relationship between an index and rainfall). The relationships between observed rainfall and observed SSTA, and between observed rainfall and predicted SSTA, are used to predict monthly rainfall in a leave-one-out cross-validation framework over the period 1960–2010. This predicted rainfall is then compared with the observed rainfall to evaluate the skill of the HLC model.
The prediction skill obtained is compared with the case where the concurrent modeled SSTA indices are not considered (which provides the lower limit of predictability) and where modeled SSTA indices are replaced with concurrent observed SSTA indices (representing the scenario where the decadal predictions are “perfect,” which provides the upper limit of predictability). Using this novel but simplified version of the existing operational statistical model, this study aims to quantify the merit in including SSTA predictions from the CMIP5 decadal experiments in Australian rainfall forecasting at an interannual timescale.
2 Models and Methods
2.1 SST and Rainfall Data
Decadal hindcasts integrated between 1960 and 2010 from five CMIP5 models are considered in this study. These models are CanCM4 (i1), GFDL-CM2.1, MIROC5, MPI-ESM-LR, and HadCM3 (i2 and i3). While data from other models are available at the CMIP5 data archive, only these models provided annual initializations at the time of the commencement of the study; the other models provided decadal hindcast data initialized every 5 years over 1960–2010. Choudhury, Sharma, et al. (2016) reported that 5-yearly initialized decadal hindcast experiments could contain spurious imprinting of the observed ENSO signal during 1960–2010, leading to biased results over the tropical Pacific even after drift correction. The selected model decadal hindcasts are initialized every year from 1960 to 2000, making a total of 41 overlapping decadal projections, and encompass both full-field initialization (CanCM4i1, GFDL-CM2.1, and HadCM3i3) and anomaly initialization (MIROC5, MPI-ESM-LR, and HadCM3i2) methods. The models have a minimum of 3 and a maximum of 10 ensemble members for each year (Table 1). The SSTA indices (defined in the next section) derived from the ensemble mean of each of the 41 forecasts of each model are drift corrected using a leave-one-out cross-validation scheme. The standard drift correction method involves estimating the lead time-dependent mean drift, which is simply the average of all but one hindcast ensemble biases, and subtracting it from each of the ensembles (ICPO, 2011). However, here we use the best drift correction method for each index and model following the results of Choudhury et al. (2017). Most predicted indices are corrected either by the initial condition based drift correction method (Fučkar et al., 2014) or the trend based drift correction method (Kharin et al., 2012).
These methods account for additional dependence on the observed initial conditions or additional time-dependent trend adjustment terms over the standard mean drift correction method. A thorough description of these drift correction methods and their effects on SSTA can be found in Choudhury et al. (2017). Indices from all the models are then combined to form the multimodel ensemble mean (MME). Choudhury et al. (2017) demonstrated that drift correcting individual models prior to averaging leads to the most skillful forecasts. Figure S1 in the supporting information shows the impact of this drift correction. CIs for each decade from each of the models are drift corrected and then averaged to form the mean time series of each index from the drift-corrected MME (blue). This is compared with the case where individual predictions are not drift corrected before averaging to form the MME (Raw, black) and with the mean time series from the observations (Obs, red). Drift correcting individual models before averaging leads to a much better match with the observed decadal time series than the raw values.
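The standard mean drift correction and the subsequent multimodel averaging described above can be sketched as follows. This is a minimal illustration only: it implements the standard leave-one-out mean drift correction, not the initial-condition-based or trend-based variants actually selected in the study, and the function names and toy numbers are hypothetical.

```python
def loo_mean_drift_correct(hindcasts, observations):
    """Subtract the lead-time-dependent mean drift from each hindcast.

    hindcasts[j][t]    : ensemble-mean index for start year j at lead t
    observations[j][t] : verifying observed index for the same start year and lead
    The drift applied to start year j is estimated without using hindcast j.
    """
    n_starts = len(hindcasts)
    n_leads = len(hindcasts[0])
    corrected = []
    for j in range(n_starts):
        row = []
        for t in range(n_leads):
            # Mean bias at lead t over all start years except j (leave-one-out).
            drift = sum(
                hindcasts[k][t] - observations[k][t]
                for k in range(n_starts) if k != j
            ) / (n_starts - 1)
            row.append(hindcasts[j][t] - drift)
        corrected.append(row)
    return corrected


def multimodel_mean(corrected_by_model):
    """Average already drift-corrected indices across models."""
    n_models = len(corrected_by_model)
    n_starts = len(corrected_by_model[0])
    n_leads = len(corrected_by_model[0][0])
    return [
        [sum(model[j][t] for model in corrected_by_model) / n_models
         for t in range(n_leads)]
        for j in range(n_starts)
    ]


# Toy example (made-up numbers): three start years, three lead times.
obs = [[0.1, 0.2, 0.3], [-0.2, 0.0, 0.4], [0.3, -0.1, 0.0]]
drift = [0.5, 1.0, 1.5]  # hypothetical bias that grows with lead time
raw = [[o + d for o, d in zip(row, drift)] for row in obs]
corrected = loo_mean_drift_correct(raw, obs)
```

In this toy case the bias depends only on lead time, so the corrected values recover the observations; real hindcasts would retain forecast error after the mean drift is removed.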
Model | Group | Initialization method | Ensemble size | Historical run period |
---|---|---|---|---|
MIROC5 | Atmosphere and Ocean Research Institute (AORI), National Institute for Environmental Studies (NIES), and Japan Agency for Marine-Earth Science and Technology (JAMSTEC), Japan | Anomaly initialization | 6 | 1850/01 to 2005/12 |
CanCM4 (i1) | Canadian Centre for Climate Modelling and Analysis (CCCma), Canada | Full-field initialization | 10 | 1961/01 to 2005/12 |
HadCM3 (i2 and i3) | Met Office Hadley Centre, United Kingdom | i2 = Anomaly initialization; i3 = Full-field initialization | 10 | 1859/12 to 2005/12 |
GFDL-CM2p1 | National Oceanic and Atmospheric Administration (NOAA), Geophysical Fluid Dynamics Laboratory (GFDL), USA | Full-field initialization | 10 | 1861/01 to 2040/12 |
MPI-ESM-LR | Max Planck Institute for Meteorology (MPI-M), Germany | Anomaly initialization | 3 | 1850/01 to 2005/12 |
Observed SSTA data are obtained from the Hadley Centre Global Sea Ice and Sea Surface Temperature (HadISST) data set (Rayner, 2003).
Observed gridded monthly rainfall data over Australia for 1961–2010 are obtained from the Australian Water Availability Project (Jones et al., 2009) at a 0.5° × 0.5° resolution.
2.2 SSTA Indices
The choice of SSTA indices here is governed by two factors. The first factor is the relationship each index has with seasonal rainfall on a concurrent basis. The second factor is the predictability each index exhibits based on CMIP5 decadal hindcasts. An SSTA index with low predictability but high correlation with rainfall is of limited use in forecasting for the long lead times of interest. This is a key reason why dynamical precipitation simulations are of limited value in formulating seasonal precipitation forecasts, as their distinction from noise disappears within a few months from initialization (Mehrotra et al., 2014).
Ten SSTA CIs that have been shown to have a significant relationship with Australian rainfall are considered as potential predictors. They are Niño 3, Niño 4, Niño 3.4, the El Niño Modoki index (EMI), the Dipole Mode index (DMI), the Indian Ocean East Pole Index (EPI), the Indian Ocean West Pole Index (WPI), the Indonesian Index (II), the Tasman Sea Index (TSI), and the TBV (see Table 2 for details). The first nine indices have been used as a basis for operational seasonal forecasts (Schepen et al., 2012), while the TBV has recently been shown to have significant relationships with Australian rainfall by Choudhury, Sen Gupta, et al. (2016). Many of these indices are also useful predictors for rainfall in other parts of the world. The anomalies for the observations (HadISST) are calculated with respect to a 1871–2015 climatology, and those for the decadal experiments are calculated from their respective historical runs (usually 1860 to 2005).
Climate index | Description | Group |
---|---|---|
Niño 3 | Average SSTA over 5°N to 5°S, 150–90°W | Pacific |
Niño 4 | Average SSTA over 5°N to 5°S, 160°E to 150°W | Pacific |
Niño 3.4 | Average SSTA over 5°N to 5°S, 170–120°W | Pacific |
EMI (C) | Average SSTA over 10°N to 10°S, 165°E to 140°W | Pacific |
EMI (E) | Average SSTA over 5°N to 15°S, 110–70°W | Pacific |
EMI (W) | Average SSTA over 20°N to 10°S, 125–145°E | Pacific |
EMI (El Niño Modoki Index) | EMI (C) − 0.5 (EMI (W) + EMI (E)) | Pacific |
WPI (Indian Ocean West Pole Index) | Average SSTA over 10°N to 10°S, 50–70°E | Indian |
EPI (Indian Ocean East Pole Index) | Average SSTA over 0°N to 10°S, 90–110°E | Indian |
DMI (Dipole Mode index) | WPI − EPI | Indian |
II (Indonesian index) | Average SSTA over 0°N to 10°S, 120–130°E | Indian |
TSI (Tasman Sea index) | Average SSTA over 30–40°S, 150–160°E | Extratropical |
TBV (A) | Average SSTA over 15°N to 15°S, 40°W to 60°E | Atlantic-Indian |
TBV (P) | Average SSTA over 15°N to 15°S, 180–150°W | Pacific |
TBV (Tropical Transbasin Variability Index) | TBV (P) − TBV (A) | Pacific-Atlantic Transbasin |
Note. See Schepen et al. (2012), Chikamoto et al. (2015), and Choudhury, Sen Gupta, et al. (2016) for details. The indices are calculated from monthly mean SSTA. The anomalies for HadISST are calculated with respect to 1871–2015, and those for the decadal experiments are calculated from their respective historical runs (Table 1). SSTA = sea surface temperature anomaly.
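The index definitions in Table 2 reduce to box averages and simple differences of box averages. The sketch below shows one way to compute them. It is a minimal illustration under stated assumptions: the cosine-of-latitude area weighting and the 0–360°E longitude convention are choices made here rather than details given in the text, and the function names are hypothetical. Under that convention the Niño 3 box (150–90°W) becomes 210–270°E, and the TBV (A) box (40°W to 60°E) wraps across the prime meridian as (320, 60).

```python
import math


def box_mean(ssta, lats, lons, lat_bounds, lon_bounds):
    """Area-weighted mean SSTA over a latitude/longitude box.

    ssta[i][j] is the anomaly at lats[i] (degrees north) and lons[j]
    (degrees east, 0-360). lon_bounds = (west, east); west > east means
    the box crosses the prime meridian.
    """
    south, north = lat_bounds
    west, east = lon_bounds
    total = weight_sum = 0.0
    for i, lat in enumerate(lats):
        if not south <= lat <= north:
            continue
        w = math.cos(math.radians(lat))  # assumed cos(latitude) area weight
        for j, lon in enumerate(lons):
            if west <= east:
                inside = west <= lon <= east
            else:
                inside = lon >= west or lon <= east
            if inside:
                total += w * ssta[i][j]
                weight_sum += w
    return total / weight_sum


def emi(c, e, w):
    """El Nino Modoki index: EMI(C) - 0.5 * (EMI(W) + EMI(E))."""
    return c - 0.5 * (w + e)


def dmi(wpi, epi):
    """Dipole Mode index: WPI - EPI."""
    return wpi - epi


def tbv(p, a):
    """Tropical transbasin variability index: TBV(P) - TBV(A)."""
    return p - a
```

For example, `dmi(box_mean(f, lats, lons, (-10, 10), (50, 70)), box_mean(f, lats, lons, (-10, 0), (90, 110)))` would give the DMI for a monthly anomaly field `f`.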
The predictability of decadal hindcast indices was ascertained using random skill as a reference, separately for the first nine indices using six CMIP5 GCMs (Choudhury et al., 2015) and for TBV using a single GCM with multiple (10) ensembles (Chikamoto et al., 2015). An assessment of this predictability is reproduced in Figure 1, with significant predictability associated with TBV up to 36 months from initialization. This high predictability of TBV could stem from the larger spatial extent of TBV and/or from the connection between the Pacific and Atlantic Oceans. The high correlation between Australian rainfall and the TBV in certain regions (Choudhury, Sen Gupta, et al., 2016; Johnson et al., 2018) offers the potential for useful rainfall forecasts that extend beyond normal seasonal timescales. The other SSTA indices exhibit shorter predictive timescales, with most becoming equivalent to “noise” within the first 9 months. This is consistent with past research on the limited ability of climate models to cross the “spring barrier” in predicting the El Niño–Southern Oscillation (Ruiz et al., 2005).
2.3 HLC Model
The HLC is a simplified version of the “bridging” (BJP-BMA) method, which is the statistical approach used operationally by the Australian BoM to forecast seasonal rainfall from CIs (Schepen & Wang, 2015). The HLC hierarchically combines linear regression models of predictands (such as rainfall) and predictors (such as CIs), using weights representative of the error covariance in each regression model, in a double (nested) leave-one-out cross-validation framework. Unlike a multiple linear regression model, HLC uses individual linear regression models of the predictors and the predictand, while the combining weights are calculated based on the errors in each of these models (equation 2). Given the limited amount of data available, it is imperative to use a strict cross-validation scheme to obtain robust results. We assume that the values of rainfall and climate indices are all known at the start of the forecast (Lead 0) and then progressively forecast the rainfall values at subsequent lead times using n individual linear regressions, one for each of the n CI predictors (called $x_i$). These are then combined using weights representative of the errors in each of these linear regressions, ascertained based on the covariance of the errors across all linear regressions developed. Use of the error covariance as the basis for this combination approach ensures that common information in the models being combined does not get overweighted. For each grid point and season, rainfall at a particular lead time (say, l) is assumed to depend on the initial observed CIs (at lead 0, $x_{i,0}$) and the concurrent MME CIs (at lead l, $x_{i,l}$). A schematic of this rainfall prediction model is shown in Figure 2, followed by a detailed explanation of the process.
- The components relevant to the dynamical model are denoted by d and use concurrent predictors at lead l.
- The components relevant to the empirical model are denoted by e and use lagged predictors at Lead 0.
- The predictors from the observations have “Obs” as a subscript, while those from the decadal multimodel ensemble have “MME.”

1. As mentioned in section 2.1, a total of 41 overlapping decades (1960–1970, 1961–1971, … , 2000–2010) are considered in this study.
2. As a first step, both rainfall and CIs are transformed using the extended form of the Box-Cox transform, introduced by Yeo and Johnson (2000), following the “bridging” approach of Q J Wang et al. (2009). They have shown this method of normalizing the data to perform best for Australian rainfall, with the approach being an integral part of the seasonal forecasting procedure adopted by BoM. The transformation coefficient for the rainfall is noted. The discussion henceforth assumes normalized variables are used as predictors and predictands. Similarly, the predictors and predictands represented in equations 2 and 3 earlier denote such transformed variables.
3. For each grid point and lead time (in months), leave out the decadal forecast initialized in one specified year for validation, denoting this as “val1,” and use the forecasts from the remaining initializations for training, denoted “tr1.” The val1 case is the decade that is finally predicted using this model, so its rainfall value is treated as unknown.
4. As a result, there are 40 × 1 training rainfall values (40 being the number of years when decadal forecasts are issued after one forecast decade is left out for cross validation), 40 × ne training initial observed CIs (ne equalling 10 in the present study and representing the number of empirical, observed predictors), and 40 × nd training concurrent (drift-corrected) dynamical multimodel-averaged CIs (nd equalling 6 in this case and representing the SSTA indices that had longer predictability horizons; Choudhury et al., 2015). Note that the observed CIs are used at the initialization time step (denoted 0), while the dynamical CIs are used at the concurrent time step for which the forecasts are being made (denoted l). Hence, the empirical case represents lagged observed information, while the dynamical case represents concurrent predicted information in the predictor variables.
5. Next, perform the following sequential operations:
   a. Form individual linear regressions between each of the ne + nd (16) predictors and the predictand in a second leave-one-out cross-validation scheme, leaving one additional forecast out for validation (“val”) and using the other 39 as the training data (“tr”). Note the difference from “val1” and “tr1.”
      1. Ascertain the linear regression coefficients, $(m,c)_{tr}$ (2 × 16), between the training rainfall (39 × 1) and each of the 16 predictors.
      2. Apply these coefficients to the validation set of predictors to obtain ne + nd estimates of the validation rainfall (1 × 16) and hence the residual error terms (1 × 16).
   b. Repeat Step 5a for all 40 possible cross-validation cases (∀ val ∈ tr1) to obtain the final error estimate (40 × 16).
   c. Calculate the weights (16 × 1) by minimizing the mean squared error loss using the covariance matrix of these errors, following Timmermann (2006).
6. Next, fit individual regressions of the entire (40 × 1) tr1 response vector on the tr1 observed CIs (40 × 10) and MME CIs (40 × 6) to obtain new regression coefficients (2 × 16), which differ from $(m,c)_{tr}$ in Step 5a.
7. Then predict the originally left-out forecast (val1) by applying these new coefficients (2 × 16) in the 16 individual linear regressions to the observed CI values at the start of the forecast (1 × 10) and the concurrent CIs from the MME (1 × 6) for that particular decade.
8. Combine these independent regressions using the weights from Step 5c to obtain the predicted rainfall for each lead time l and grid point.
9. Use the transformation coefficient from Step 2 to inversely transform the predicted rainfall values and obtain the final rainfall forecast for each month and grid point.
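The nested leave-one-out regression and error-covariance weighting described above can be sketched for a single grid point and lead time as follows. This is a minimal illustration, not the authors' code. The function names are hypothetical, the sample covariance of the cross-validated residuals is assumed for the error covariance, the weights are assumed to be the minimum-variance weights constrained to sum to one, and all inputs are assumed to be already transformed. Each predictor series holds either a lagged observed index or a concurrent MME index.

```python
def fit_line(x, y):
    """Least-squares slope m and intercept c for y = m * x + c."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    m = sxy / sxx
    return m, my - m * mx


def loo_errors(predictors, y):
    """Inner leave-one-out: errors[v][i] is the residual of predictor i's
    regression for sample v when v is withheld from the fit."""
    n = len(y)
    errors = []
    for v in range(n):
        keep = [k for k in range(n) if k != v]
        row = []
        for x in predictors:
            m, c = fit_line([x[k] for k in keep], [y[k] for k in keep])
            row.append(y[v] - (m * x[v] + c))
        errors.append(row)
    return errors


def covariance(errors):
    """Sample covariance matrix of the columns of errors."""
    n, p = len(errors), len(errors[0])
    mu = [sum(r[i] for r in errors) / n for i in range(p)]
    return [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in errors) / (n - 1)
             for j in range(p)] for i in range(p)]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def combination_weights(errors):
    """Weights proportional to inverse(covariance) @ ones, scaled to sum to 1."""
    sigma = covariance(errors)
    raw = solve(sigma, [1.0] * len(sigma))
    total = sum(raw)
    return [r / total for r in raw]


def hlc_predict(train_predictors, train_y, new_x):
    """Forecast one withheld case from the training set of the outer loop."""
    w = combination_weights(loo_errors(train_predictors, train_y))
    preds = []
    for x, x_new in zip(train_predictors, new_x):
        m, c = fit_line(x, train_y)  # refit on the full training set
        preds.append(m * x_new + c)
    return sum(wi * p for wi, p in zip(w, preds)), w


def cross_validated_forecasts(predictors, y):
    """Outer leave-one-out: forecast every case from the remaining ones."""
    n = len(y)
    out = []
    for v in range(n):
        keep = [k for k in range(n) if k != v]
        tr = [[x[k] for k in keep] for x in predictors]
        pred, _ = hlc_predict(tr, [y[k] for k in keep], [x[v] for x in predictors])
        out.append(pred)
    return out
```

Because the weights come from the full error covariance rather than from each model's error variance alone, two predictors that make similar errors share weight instead of being counted twice. Weights may be negative under this formulation.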
While the rainfall forecasting model is run for each month, results are presented as annual rainfall means, since the study is focused on investigating rainfall prediction at interannual timescales.
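The normalization applied before fitting, and inverted to obtain the final forecast, is the Yeo and Johnson (2000) transform. A minimal sketch of the transform and its inverse follows. How the transformation coefficient (here `lam`) is estimated is not covered, and the function names are illustrative.

```python
import math


def yeo_johnson(y, lam):
    """Yeo-Johnson transform of a single value y with coefficient lam."""
    if y >= 0:
        if abs(lam) < 1e-12:
            return math.log(y + 1.0)
        return ((y + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < 1e-12:
        return -math.log(1.0 - y)
    return -((1.0 - y) ** (2.0 - lam) - 1.0) / (2.0 - lam)


def inverse_yeo_johnson(z, lam):
    """Invert yeo_johnson; the sign of z selects the branch because the
    transform preserves the sign of y."""
    if z >= 0:
        if abs(lam) < 1e-12:
            return math.exp(z) - 1.0
        return (lam * z + 1.0) ** (1.0 / lam) - 1.0
    if abs(lam - 2.0) < 1e-12:
        return 1.0 - math.exp(-z)
    return 1.0 - (1.0 - (2.0 - lam) * z) ** (1.0 / (2.0 - lam))
```

Unlike the original Box-Cox transform, this form is defined for negative values, which matters here because the CIs are anomalies and can take either sign.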
- Empirical (lagged-only) model: Predicting rainfall using the same HLC method, but only considering the initial observed SSTA values at the start of the forecast as predictors. Without any information about concurrent forecast SSTA indices, this involves predicting rainfall based on lagged SSTA-rainfall relationships and serves as the lower bound on predictability.
- Perfect-forecast model: Predicting rainfall using the same HLC method but considering concurrent observed SSTA indices instead of the concurrent forecast indices, alongside the lagged observed SSTA indices. This indicates the skill that could be achieved if our decadal forecasts were perfect and serves as the upper limit of our prediction ability.
It should be noted that the MME-based HLC model and the perfect-forecast model each have 16 predictors, while the empirical model has 10 predictors. It should also be noted that the HLC approach ensures greater parsimony than a multivariate regression model with as many predictor variables. This is because individual regression models are fitted for each predictor variable, and the model weights only modulate the relative importance of each such model, ensuring that the final prediction retains the same level of stability as the individual regression models.
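In notation assumed here for illustration (the symbols are not taken from the original equations), the combination can be written compactly. Let $\Sigma$ be the covariance matrix of the cross-validated residuals of the $n_e + n_d$ single-predictor regressions, $\mathbf{1}$ a vector of ones, and $(m_i, c_i)$ the slope and intercept of the $i$th regression. Minimizing the combined error variance subject to the weights summing to one, in the spirit of Timmermann (2006), gives the weights and the forecast at lead $l$ as

```latex
\mathbf{w} = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^{\mathsf{T}}\Sigma^{-1}\mathbf{1}},
\qquad
\hat{R}_{l} = \sum_{i=1}^{n_e} w_i \left(m_i\, x^{\mathrm{Obs}}_{i,0} + c_i\right)
            + \sum_{j=1}^{n_d} w_{n_e+j} \left(m_{n_e+j}\, x^{\mathrm{MME}}_{j,l} + c_{n_e+j}\right)
```

Each predictor therefore enters through a two-coefficient regression and a single weight, and the weights only rescale the individual forecasts.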
3 Analysis, Results, and Discussion
In this study, primary results are presented for Queensland (QLD) and the Northern Monsoon Region (NMR), which occupy the largest areas (21% and 20%, respectively) among the 10 Australian climate regions defined by Timbal et al. (2008) (Figure 3). These two regions have relatively high rainfall and strong seasonality and are important for agricultural production. Additionally, rainfall in these regions has been shown to have a strong relationship with tropical Indo-Pacific SST (Kaiser Khan et al., 2014). These two regions were also shown by Q J Wang and Robertson (2011) to have the highest skill in predicting next-season rainfall based on lagged climate indices using the BMA approach (the bridging component of the BoM forecasting model). The performance of the three prediction models (the empirical lagged-only model, the MME-based HLC model, and the perfect-forecast model) is assessed using the correlation between the observed and predicted rainfall over the regions. The aim of the current study is not to develop the best rainfall prediction model but to quantify the merit of using CMIP5 decadal outputs in rainfall prediction and to estimate the improvement limits that could be expected if the decadal prediction experiments were perfect.
We expect the empirical (lagged-only) model to have its highest skill at short lead times and its skill to decline with time as the lagged relationship between the indices and rainfall diminishes (Figure 2). The perfect-forecast model, which uses concurrent observed indices, would have the highest skill among the three prediction models and should maintain a high skill score at all lead times. We would expect the MME-based HLC model to lie between these two extreme cases. Specifically, it would (i) have similar skill to the other two models at the start of the forecast, since the GCM forecasts should have their highest skill; (ii) decay with increasing lead time, while still being better than the empirical model and worse than the perfect-forecast model; and (iii) eventually become statistically indistinguishable from the empirical model as the GCM predictions lose skill and stop adding any useful information to the prediction model. At higher leads, we would expect more predictors with low weights (since the weights need to sum to 1) and no preference between the empirical and dynamical predictors, as neither would provide useful information.
Figure 4 shows the areal mean correlation skills of the three prediction models ( , , ) for annual mean rainfall over QLD and NMR. The gray dashed line marks the correlation value that is significant at the 90% level for 39 degrees of freedom (0.257, since we have 41 decades in total). starts with a correlation skill centered around 0.4 and 0.3 for QLD and NMR, respectively, which drops to close to zero with increasing lead time. shows significant correlation with observed rainfall only for the lead of 1 year for both regions. shows significant skill for all five leads. The skill of should ideally not change with lead time since the concurrent relationships between observed predictors and rainfall are independent of time. However, factors unrelated to the various predictors that affect rainfall could become more or less important over time, which could lead to the small deviations in prediction skill seen here. The skill of is between 0.42 and 0.52 for QLD and between 0.3 and 0.4 for NMR. This potential skill of rainfall forecasts is greater for QLD than for NMR because the statistical relationship between rainfall and CIs is stronger for QLD than for NMR (Khan et al., 2015).
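The skill measure and significance threshold described above can be illustrated with a minimal sketch. It assumes a Pearson correlation between observed and predicted rainfall and a two-sided test, and it approximates the critical correlation with the Fisher z-transform; the test actually used to obtain the 0.257 threshold is not stated here, so this approximation gives a value close to, but not identical to, that number. The function names `pearson_r` and `critical_r_fisher` are hypothetical.

```python
import math
from statistics import NormalDist


def pearson_r(obs, pred):
    """Pearson correlation between observed and predicted series."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def critical_r_fisher(n, level=0.90):
    """Approximate two-sided critical correlation for sample size n.

    Uses the Fisher z-transform (an assumption of this sketch): atanh(r)
    is treated as normal with standard error 1/sqrt(n - 3).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z / math.sqrt(n - 3))


# 41 hindcast decades, as in the text; the result is roughly 0.26,
# close to the 0.257 threshold quoted for Figure 4.
r_crit_41 = critical_r_fisher(41)
```

A correlation skill would then be flagged as significant when `pearson_r(obs, pred) > r_crit_41`. An exact t-based threshold would differ slightly in the third decimal place.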
starts with correlations between and (because of the high skill of GCMs at short lead times), then deteriorates, and eventually becomes similar to . This also shows that we are losing skill in our dynamical predictors even at a lead of 1 year. The biggest improvements from using over are noticed for the early leads of 2 and 3 years, since the climate models are still skillful enough to add information beyond that in the lagged observed indices. The biggest drop in skill is noticed between the leads of 3 and 4 years. This could be because none of the predicted indices add any skill to the prediction model at that point. For QLD, the correlation of remains significant until year 3, even though the lagged observations ( ) provide significant information for only the first year. This indicates that the dynamical MME predictors are adding value for an additional 2 years. This is not the case for NMR, where both and show significant correlations only for the first year and then become insignificant. At higher leads, the skill of should collapse down to since there is no predictability of the CIs at leads of 4 and 5 years. However, our results show that the skill of is higher than even for leads 4 and 5, although the highest predictability of the indices is seen for TBV at 36 months. This could be because a combination of indices is providing skill out to longer lead times based on unknown physical links in the climate system, or simply because of the higher number of predictors in compared to . Nevertheless, the skill of at these leads is insignificant (around 0.1) and, hence, this issue is not investigated further.
Figures 5 and 6 show the spatial distribution of the improvement in correlation skill from with respect to that from . Hatching marks the regions where the improvements in correlation skill are significant (at a 90% level based on 39 degrees of freedom, ). Grid cells in the northwest of NMR that had missing rainfall values in certain decades have been left out of the analysis. Both regions show the lowest fraction of significant improvements for Lead 1. For QLD, the area that shows improvement increases from Lead 1 to Lead 3 and then reduces again. However, the share of area that improves significantly is around 20% for the leads of 2–4 years. The largest improvements are for Lead 3, where 87% of the area improves, with 26% showing significant improvements over . There also seems to be no consistent spatial pattern for the areas that improve, although the quality of the observed rainfall data might contribute to uncertainties in the improvements (Viney & Bates, 2004). For NMR, the largest improvement is noticed for Lead 3 (87% of the area), although 65% of the area shows significant improvements for Lead 2. This is important because although the areal mean correlation of over NMR for Lead 2 is insignificant (Figure 4b), 65% of the region shows significant improvement (Figure 5b). This suggests that if a smaller domain in the northern part of NMR were examined, enhanced predictability could have been seen for the second year (unlike Figure 4b). To compare these results with the concurrent observations-based model, Figures S2 and S3 in the supporting information show the spatial improvement in correlation skill from with respect to that from . From Lead 2 onward, shows significant improvement over around 70% of the area for both regions. Again, the area that improves is higher for QLD than for NMR, which can be attributed to the stronger statistical relationships between CIs and QLD rainfall (Khan et al., 2015).
Figures S2 and S3 represent the value that can be added by decadal prediction experiments using the current rainfall prediction model if they were perfect.
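The area percentages quoted above can be illustrated with a minimal sketch of how such fractions might be tallied from a gridded map of skill differences. It assumes simple counting of valid grid cells, with missing cells marked as `None` and excluded; whether the reported percentages use cell counts or latitude-weighted areas is not stated here. The function name `improvement_fractions` and the toy grid are hypothetical.

```python
def improvement_fractions(delta_r, significant):
    """Return (% of valid cells improved, % of valid cells significantly improved).

    delta_r     : 2D list of correlation-skill differences; None marks a
                  grid cell with missing rainfall data (excluded).
    significant : 2D list of booleans, True where the difference passes
                  the chosen significance test.
    """
    valid = improved = sig_improved = 0
    for row_d, row_s in zip(delta_r, significant):
        for d, s in zip(row_d, row_s):
            if d is None:
                continue  # skip cells left out of the analysis
            valid += 1
            if d > 0:
                improved += 1
                if s:
                    sig_improved += 1
    if valid == 0:
        raise ValueError("no valid grid cells")
    return 100.0 * improved / valid, 100.0 * sig_improved / valid


# Toy 2 x 3 grid (illustrative values only, not from the study).
toy_delta = [[0.10, -0.05, None], [0.20, 0.30, 0.00]]
toy_sig = [[True, False, False], [False, True, False]]
pct_improved, pct_significant = improvement_fractions(toy_delta, toy_sig)
```

For grids spanning a wide latitude range, weighting each cell by the cosine of its latitude would be a natural refinement of this count.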
Figures 7 and 8 show the weights of the more important predictors (CIs), that is, those with weights > 0.0625 (the mean weight for 16 predictors), from for different lead years for QLD and NMR. The number of predictors that have high weights and their rankings change with lead time to reflect the predictors that provide the most information to the prediction model. Information from other predictors may be small or might already be available from a different predictor. However, since the current prediction system requires the weights to always sum to 1, predictors still receive weights at higher leads even though they might be adding very little information to the system. Generally, results show that the fraction of CIs with higher weights from the MME increases with lead time, while the distinction between the two sets of predictors is weakest for leads of 1 and 4 years. With increasing lead time, MME predictors with higher predictability (Figure 1) become more important for both regions. For QLD (Figure 7), both lagged observed CIs and concurrent MME predictors get high weights at the lead of 1 year, although the fraction of lagged CIs is higher. With increasing lead time, the proportion of predictors that are from the concurrent MME ( ) increases, though at least the two strongest predictors are always from the lagged observed CIs ( ). For NMR (Figure 8), Lead 1 shows the expected result of lagged observed indices providing the most information to the prediction model. The strongest predictors from the lagged observed CIs ( ) are DMI, TSI, II, and EMI. Schepen et al. (2012) showed EMI to be the best predictor for northern NMR and QLD for the MAM season and for southern NMR and QLD for the SON season, and II, DMI, and TSI to be the best for the DJF season at monthly lags over northern Australia, although SON rainfall was strongly related to the ENSO indices. The best performing predictors from the concurrent model CIs ( ) are TBV, EMI, and WPI.
These indices show the best match with the average decadal time series of the observations in Figure S1. Also, these three indices were shown to have the highest predictability from the current set of models used (Figure 1; Choudhury et al., 2015). However, the ENSO indices from the concurrent MME CIs also show up as important predictors for certain leads in both regions. This could be because the lagged observed CIs contribute even less information to the prediction model at those leads. For comparison, Figures S4 and S5 in the supporting information show the weights of the predictors from the concurrent observations-based model ( ). From the concurrent observed indices ( ), EMI and TBV show the highest weights for all lead years. Next, Niño 4 appears to be the strongest concurrent predictor among the ENSO indices. This can be expected from Risbey et al. (2009), who showed northern and northeastern Australia to have the highest correlation with Niño 4 among the ENSO indices. From the lagged indices ( ), DMI, TSI, and II still show up as important lagged predictors for QLD and NMR, suggesting that the weights in are robust.
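The weight constraint and the 0.0625 cutoff discussed above can be illustrated with a minimal sketch. It assumes that each predictor has some non-negative raw score, which is simply normalized so that the weights sum to 1; this stands in for, and is not, the BMA-based weighting used in the study. Predictors are then flagged as important when their weight exceeds the equal-weight mean 1/N (0.0625 for N = 16). The predictor names, raw scores, and function names are hypothetical.

```python
def normalize_weights(raw_scores):
    """Scale non-negative raw scores so that the weights sum to 1."""
    total = sum(raw_scores.values())
    if total <= 0:
        raise ValueError("raw scores must have a positive sum")
    return {name: score / total for name, score in raw_scores.items()}


def important_predictors(weights):
    """Return (name, weight) pairs above the equal-weight mean 1/N,
    ranked from highest to lowest weight."""
    threshold = 1.0 / len(weights)
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, w) for name, w in ranked if w > threshold]


def combine(forecasts, weights):
    """Weighted average of the individual predictor-based forecasts."""
    return sum(weights[name] * forecasts[name] for name in weights)


# 16 placeholder predictors with illustrative raw scores (not study values).
raw = {f"predictor_{i:02d}": 1.0 for i in range(1, 17)}
raw["predictor_01"] = 4.0
raw["predictor_02"] = 2.0

weights = normalize_weights(raw)
top = important_predictors(weights)  # only the two up-weighted predictors pass
```

Because the weights are forced to sum to 1, every predictor keeps a nonzero share even when none is informative, which is why a relative cutoff such as 1/N is a convenient way to separate the dominant predictors from the rest.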
4 Conclusions
There has been substantial research investigating predictions at the decadal timescale, primarily because of the socioeconomic benefits they could provide. Various modeling groups have run such decadal experiments and shared their outputs for researchers to investigate. While the decadal experiments show promising results, predictions of rainfall remain poor compared with those of other variables. Given the practical importance of seasonal and longer-term rainfall forecasts, this study describes an alternative way of using decadal experiments for rainfall prediction over Australia. A hierarchical model combination approach is used to combine regression models based on SSTA indices to obtain the best rainfall forecast. The model combines regressions based on both observed SSTA indices and the decadal predictions of SSTA indices, using weights that are allowed to evolve freely with lead time and that penalize predictors with weaker relationships to rainfall. To estimate the full potential of the current model, concurrent decadal SST outputs are also replaced with observed SST values to determine the upper limit of predictability that can be obtained from the current setup (were our decadal models able to provide perfect forecasts). This study presents an application of this method for interannual rainfall prediction over Australia.
Simulations of relevant SSTA indices from the decadal experiments were used as predictors within a simplified version of the operational seasonal rainfall prediction model used by the Australian BoM, to investigate whether they would add merit beyond using lagged observed SSTA indices. Analyses over two of the largest climate zones in Australia (Queensland and the Northern Monsoon Region) show the current rainfall prediction model (which includes concurrent SST predictions from the CMIP5 decadal experiments) to be significantly skillful (at a 90% level) out to a maximum of 3 years for QLD and 1 year for NMR. Although at higher leads most grid cells of the regions showed improvements over just using lagged indices as predictors, only around 20% showed significant improvements. Thus, these results in their current form may not be sufficient to warrant use in hydrologic planning. By comparison, if our decadal models could provide perfect forecasts, significant improvements would be expected over 70% of the regions for all leads.
It is also important to note that the current prediction model's ability to predict rainfall at longer timescales relies heavily on indices with high interannual predictability, such as TBV. While the connections between the Pacific and Atlantic Oceans are becoming increasingly clear (McGregor et al., 2018), the mechanisms giving rise to the longer predictability are not fully understood. The Atlantic-Pacific teleconnection is poorly simulated in the models, and the representation (and response) of TBV in the models is subject to large uncertainties. However, TBV has so far received rather limited attention, and further work is needed to examine whether the TBV-Australian rainfall relationship is stable through time.
The results presented in this study suggest that there is significant merit in adding SSTA outputs from CMIP5 decadal experiments to interannual rainfall prediction (from 1 year for NMR to 3 years for QLD), although significant improvements at higher leads might be possible for certain subregions within NMR. These two areas also have very distinct wet and dry seasons. The results presented here for interannual rainfall skill may not scale down comparably for wet season rainfall prediction. However, many climate modes are phase locked to particular seasons, and thus higher skill could also be expected for certain regions during certain seasons. These results will also be updated as the next phase of decadal prediction outputs from CMIP6 becomes available. Alongside this, there are possibilities for refining these results using a range of additional investigations, such as (a) using information from the spatial distribution of prediction skill more effectively; (b) considering other predictors that have higher memory, such as warm water volume or other upper ocean heat content measures, or using temporal gradients of SST indices (instead of just spatial gradients or just the indices); and (c) supplementing these results from the decadal experiments with the existing seasonal forecasts from the BoM to enhance the usability of such predictions. Further, as the archives of decadal prediction outputs increase, these results of interannual rainfall prediction can be reevaluated. The methodology presented here can be used for other variables, although the efficacy of the method may be different. As this method combines information from relevant predictors while favoring the more skillful ones, it can be used for variables with low predictability if there are other covarying variables that are more predictable in the decadal simulations.
Acknowledgments
The World Climate Research Program's Working Group on Coupled Modelling is responsible for CMIP. We thank the climate modeling groups for producing and making available their model outputs used in our study (see Table 1, accessible online through http://pcmdi9.llnl.gov/esgf-web-fe/). For CMIP, the U.S. Department of Energy's Program for Climate Model Diagnosis and Intercomparison provides coordinating support and led development of software infrastructure in partnership with the Global Organization for Earth System Science Portals. This research was undertaken with the assistance of resources from the National Computational Infrastructure (NCI), which is supported by the Australian Government. We acknowledge the Australian Research Council (ARC) for funding this research. We also acknowledge the support and encouragement from the Australian Bureau of Meteorology and the New South Wales State Water Corporation. Bellie Sivakumar also acknowledges the support from the Australian Research Council (ARC) through the Future Fellowship Grant (FT110100328). We would like to thank the three reviewers and the associate editor for their comments, which have helped improve the overall quality of the manuscript.