Representing Nitrogen, Phosphorus, and Carbon Interactions in the E3SM Land Model: Development and Global Benchmarking
Abstract
Over the past several decades, the land modeling community has recognized the importance of nutrient regulation on the global terrestrial carbon cycle. Implementations of nutrient limitation in land models are diverse, varying from applying simple empirical down-regulation of potential gross primary productivity under nutrient deficit conditions to more mechanistic treatments. In this study, we introduce a new approach to model multinutrient (nitrogen [N] and phosphorus [P]) limitations in the Energy Exascale Earth System Model (E3SM) Land Model version 1 (ELMv1-ECA). The development is grounded on (1) advances in representing multiple-consumer, multiple-nutrient competition; (2) a generic dynamic allocation scheme based on water, N, P, and light availability; (3) flexible plant CNP stoichiometry; (4) prognostic treatment of N and P constraints on several carbon cycle processes; and (5) global data sets of plant physiological traits. Through benchmarking the model against best knowledge of global plant and soil carbon pools and fluxes, we show that our implementation of nutrient constraints on the present-day carbon cycle is robust at the global scale. Compared with predecessor versions, ELMv1-ECA better predicts global-scale gross primary productivity, ecosystem respiration, leaf area index, vegetation biomass, soil carbon stocks, evapotranspiration, N2O emissions, and NO3- leaching. Factorial experiments indicate that representing the phosphorus cycle improves modeled carbon fluxes, while considering dynamic allocation improves modeled carbon stock density. We also highlight the value of using the International Land Model Benchmarking (ILAMB) package to evaluate and document performance during model development.
Key Points
- We developed a new carbon-nitrogen-phosphorus land model (ELMv1-ECA) integrated in the E3SM Earth system model
- We benchmarked the simulated present-day carbon cycle using the International Land Model Benchmarking package (ILAMB)
- We documented model performance and identified necessary future improvements
1 Introduction
Anthropogenic CO2 emissions since the preindustrial era have increased atmospheric CO2 concentrations to levels that the Earth has not experienced for hundreds of thousands of years (Petit et al., 1999). Such enhanced atmospheric CO2 dramatically alters the global energy balance via a positive radiative forcing effect (Myhre et al., 1998). Terrestrial ecosystems currently remove from the atmosphere about a quarter of anthropogenic CO2 emissions (Houghton, 2007; Le Quéré et al., 2016) via photosynthesis, thus reducing the warming effects on climate. Furthermore, terrestrial ecosystems have the potential to sequester CO2 more efficiently in the future due to the CO2 fertilization effect (Mao et al., 2016; Norby et al., 2005).
However, this potential benefit from terrestrial ecosystems may be limited because a larger terrestrial CO2 sink requires a greater supply of essential nutrients, particularly nitrogen (N) and phosphorus (P) (Falkowski et al., 2000). Although human activities have enhanced global N and P availability (e.g., from agricultural fertilization and fossil fuel combustion; Galloway et al., 2004; Vitousek et al., 1997), at the global scale these increases are unlikely to satisfy terrestrial ecosystem nutrient demands (Hungate et al., 2003; Wieder et al., 2015). More importantly, enhanced N and P availability is imbalanced (Penuelas et al., 2013), with a potential to shift many ecosystems toward a more P limited status.
With decades of observational studies and multiple lines of evidence, it is now generally agreed that current terrestrial plant growth is often nutrient limited, and these nutrient limitations will be exacerbated under higher atmospheric CO2 levels (Hungate et al., 2003; Koerselman & Meuleman, 1996; Wieder et al., 2015; Xia & Wan, 2008). For example, global meta-analyses of N and P fertilization experiments (Elser et al., 2007) revealed widely distributed N and P colimitation on plant productivity in a wide range of terrestrial ecosystems.
Free Air CO2 Enrichment (FACE) experiments also have provided insights into possible exacerbation of terrestrial nutrient limitation under anticipated higher atmospheric CO2 concentrations. Under elevated atmospheric CO2, plants are stimulated to increase tissue construction but require more nutrients to support this higher productivity (Finzi et al., 2007). From a resource supply point of view, a system will shift to a status more limited by other resources when the supply of one resource is continuously enhanced (Chapin et al., 1987). Across many FACE experiments, the observed CO2 fertilization effect on plant growth is sustained for only a few years and then diminishes, because the systems shift to more nutrient limited conditions (Norby et al., 2010; Reich et al., 2006; Reich & Hobbie, 2013).
Although they characterize ecosystem nutrient limitation, empirical experiments cover only a small fraction of global ecosystems, making it difficult to delineate large-scale nutrient limitation patterns. To quantify global impacts of nutrient limitations on the carbon cycle, process-based land models are needed to extrapolate site-level mechanistic understanding to larger scales (Norby et al., 2016). Prevailing Earth System Model (ESM) land models (e.g., those participating in Coupled Model Intercomparison Project Phase 5 [CMIP5; Taylor et al., 2012] and CMIP6 [Eyring et al., 2016]) all consider nitrogen cycles (Gerber et al., 2010; Goll et al., 2012; Koven et al., 2013; Wang et al., 2010; Xu & Prentice, 2008; Yang et al., 2014; Zaehle & Dalmonech, 2011; Zhu & Zhuang, 2013). Some of them also consider the phosphorus cycle (Goll et al., 2012; Wang et al., 2010; Yang et al., 2014). Although these models were built with the concept that nutrient deficiency limits the carbon cycle, large uncertainties stem from how nutrient limitations are implemented (Medlyn et al., 2016; Tang & Riley, 2018; Zaehle et al., 2014).
In this study, we describe the integration of N and P dynamics in the Energy Exascale Earth System Model (E3SM) Land Model version 1 (ELMv1-ECA) and evaluate the model's performance against a wide range of global-scale observations. Compared to its predecessor (i.e., the Community Land Model version 4.5 [CLM4.5]), ELMv1-ECA has several important features, including (1) the equilibrium chemistry approximation (ECA) as a robust method to resolve multiple nutrient (N and P) competition by multiple consumers (plants, microbes, and soil mineral surfaces), (2) leaf-level nutrient effects on photosynthesis, (3) a dynamic allocation scheme that balances concurrent constraints on plants, and (4) explicit representation of leaf and root traits that affect growth and nutrient competitiveness. By comparing our model with baseline versions (CLM4 and CLM4.5), we show an improvement of predictive skill by improving nitrogen constraint and introducing phosphorus constraint on the carbon cycle.
2 Materials and Methods
2.1 E3SM Land Model
The Energy Exascale Earth System Model (E3SM) is a new ESM project sponsored by the U.S. Department of Energy (DOE; Bader et al., 2014) that focuses on addressing critical questions about the global water cycle, biogeochemistry, cryosphere, and their interactions with the climate system. The E3SM land model version zero (ELMv0) originated from the CLM4.5 (CLM4.5-BGC) with vertically resolved soil biogeochemistry and a CENTURY-like soil decomposition cascade (Koven et al., 2013). Both CLM4.5-BGC and ELMv0 represent coupled carbon and nitrogen interactions, and here we describe a new representation of coupled C, N, and P dynamics being integrated in ELM (Figure 1).

2.2 Nutrient Competition in the Plant-Soil System
Plants, soil microbes, and abiotic factors (e.g., mineral surfaces) reside in the same soil media and compete for a wide range of nutrients, including those we focus on here: NH4+, NO3-, and PO43-. Because these nutrients are usually limited in availability, strong competitive interactions are expected. This section describes the methods ELMv1-ECA uses to represent plant and soil nutrient uptake and competition.
2.2.1 Plant Nutrient Uptake













2.2.2 Soil Nutrient Uptake










2.2.3 Nutrient Competition
As soil nutrient availability decreases, competitive stresses increase, particularly when the potential nutrient demands by all nutrient consumers exceed the supply in a given time step. The partitioning of limited nutrients between consumers affects their functioning. Consequently, model predictions of carbon-nutrient dynamics are sensitive to the underlying model hypotheses regarding nutrient competition. In ELMv1-ECA we adopted a multiple-consumer-multiple-substrate competition network based on the ECA theory (Tang & Riley, 2013; Zhu et al., 2016; Zhu et al., 2017; Zhu, Riley, et al., 2016). The ECA competition theory represents (1) nutrient uptake mediated by nutrient carrier enzymes, (2) the exclusion that arises because a nutrient substrate bound to a specific enzyme cannot bind to other enzymes, and (3) the rates and affinities of consumers for the various substrates.
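The competition network described above can be illustrated with a minimal sketch. It assumes the general ECA flux form from the cited ECA literature, in which the uptake of substrate i by consumer j is k_ij E_j S_i / (K_ij (1 + Σ_k S_k/K_kj + Σ_l E_l/K_il)). The function name, argument layout, and numbers are hypothetical and are not taken from the ELMv1-ECA source code.

```python
def eca_uptake(S, E, K, kmax):
    """Uptake flux of each substrate i by each consumer j under the ECA form.

    S    : list of substrate concentrations (e.g., NH4+, NO3-, PO4 3-)
    E    : list of consumer "enzyme" abundances (plants, microbes, minerals)
    K    : K[i][j], half-saturation (affinity) constant of consumer j for substrate i
    kmax : kmax[i][j], maximum specific uptake rate of substrate i by consumer j
    All values here are illustrative assumptions, in arbitrary consistent units.
    """
    n_sub, n_con = len(S), len(E)
    flux = [[0.0] * n_con for _ in range(n_sub)]
    for i in range(n_sub):
        for j in range(n_con):
            # Other substrates occupy the binding sites of consumer j.
            substrate_load = sum(S[k] / K[k][j] for k in range(n_sub))
            # Other consumers bind substrate i and make it unavailable to j.
            consumer_load = sum(E[l] / K[i][l] for l in range(n_con))
            denom = K[i][j] * (1.0 + substrate_load + consumer_load)
            flux[i][j] = kmax[i][j] * E[j] * S[i] / denom
    return flux


# One substrate taken up by one consumer, then by two identical competitors.
alone = eca_uptake([1.0], [1.0], [[1.0]], [[1.0]])
shared = eca_uptake([1.0], [1.0, 1.0], [[1.0, 1.0]], [[1.0, 1.0]])
print(alone[0][0], shared[0][0])
```

In this toy case, adding a second consumer lowers the uptake of the first one, which is the qualitative behavior a competition scheme of this kind is meant to capture.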





2.3 N and P Impacts on Carbon Dynamics
2.3.1 Plant Carbon Assimilation


where a, b, c, d, and e are regression coefficients estimated from a photosynthesis data set covering more than 300 plant species (Walker et al., 2014). These two relationships allow ELMv1-ECA to dynamically represent direct nutrient constraints on photosynthesis.
2.3.2 Carbon Allocation

2.3.3 Prognostic Plant Stoichiometry
To account for plants' plasticity in assimilating and using nutrients, ELMv1-ECA allows C:N:P stoichiometry to be flexible, varying around a baseline value and prognostically determined by leaf level carbon fixation versus root nutrient uptake. The baseline C:N:P ratio is derived from the TRY database (leaf C:N:P; Kattge et al., 2009) and a recent synthesis of global fine root, sapwood, and heartwood C:N:P including more than 6,000 plant species (see Table S2 for plant functional type-specific C:N:P stoichiometry). The model stoichiometric baseline and natural variability are based on stoichiometric mean and standard deviation in the data set (Table S2). Root nutrient uptake is regulated (equations 4 and 5) so that plant tissue nutritional levels are maintained within the range of this observed natural variability, which we estimate to be 40% of the baseline value. If N or P supply were highly limited, plant biomass construction would be reduced and the excess nonstructural carbon would be stored within the plant.
2.4 Model Configuration and Benchmarking
Following the simulation protocol of CLM (Oleson et al., 2013), ELMv1-ECA was first spun up for 1,000 years with accelerated soil decomposition (Koven et al., 2013), followed by a 200-year regular spin-up with unaccelerated soil decomposition to reach a steady state carbon cycle. Soil phosphorus pools were initialized from observations (Yang et al., 2013) at the beginning of the regular spin-up. The spin-up simulations were forced with repeated meteorology and constant atmospheric CO2 mole fraction (285 ppm). The model was then run in a transient simulation from 1850 to 2010 with Global Soil Wetness Project (GSWP) reanalysis forcing (Dirmeyer et al., 2006), transient CO2 concentrations, N deposition (Lamarque et al., 2005), and P deposition (Mahowald et al., 2008). Model simulations were performed at a 1.9° latitude by 2.5° longitude resolution.
Simultaneously evaluating global land model carbon pool and flux predictions is complex (Luo et al., 2012), particularly since improving one aspect of model performance can often lead to degradation in others. To address this problem, the International Land Model Benchmarking package (ILAMBv2.2; Collier et al., 2016; Hoffman et al., 2017) was designed to evaluate land models across a wide range of observational constraints, including ecosystem states, fluxes, and functional responses. The full ILAMBv2 package includes model evaluations against observations of the carbon cycle, water cycle, energy cycle, and climate forcing at in situ, regional, and global scales. We focus here on the global patterns of carbon cycle-related benchmark metrics, including (1) gross primary productivity and total ecosystem respiration, (2) leaf area index (LAI), (3) soil carbon stock, (4) aboveground live biomass, and (5) evapotranspiration (ET). The global pattern of gross primary productivity (GPP) is upscaled from FLUXNET in situ observations (Baldocchi, 2003) using a model tree ensemble (MTE) technique (Beer et al., 2010). FLUXNET-MTE total ecosystem respiration is derived from nighttime measurements extrapolated to daytime based on calculated temperature sensitivities (Reichstein et al., 2005). The global ET benchmark is derived from the annual average of the GLEAM estimated sum of canopy and soil evaporation, transpiration, bare soil evaporation, and sublimation using multiple satellite sensor data (Miralles et al., 2011). The aboveground living biomass benchmark is based on more than 4,000 inventory plots and extrapolated to large scales with high-resolution remote sensing imagery (Saatchi et al., 2011). We include two independent benchmarks for the top 1-m soil carbon stock: (1) NCSCD (The Northern Circumpolar Soil Carbon Database) for the northern pan-Arctic region (Hugelius et al., 2013) and (2) HWSD (Harmonized World Soil Database) for the globe (Hiederer & Köchy, 2011).
We also benchmark CLM4.0-CN (Thornton et al., 2007) and CLM4.5-BGC (Koven et al., 2013) to show improvements in ELMv1-ECA modeling skill. Unlike ELMv1-ECA, CLM4.0 and CLM4.5 both consider carbon and nitrogen cycles but not phosphorus. Further, CLM4.0 and CLM4.5 employed a relative demand hypothesis for nutrient competition among plants, microbial decomposers, nitrifiers, and denitrifiers, in which different nutrient consumers take up soil available nutrients based on their gross demand rather than actual capability. A major improvement of CLM4.5 over CLM4.0 was to include a vertically resolved CENTURY-type soil decomposition module (Koven et al., 2013). In conducting simulations, all three models used a similar spin-up strategy (accelerated spin-up and regular spin-up), land use time series, definition of plant functional types, and global distribution of plant functional types. CLM4.0 and CLM4.5 used CRUNCEP climate forcing, while ELMv1-ECA used GSWP3 climate forcing.
3 Results and Discussion
3.1 Global Carbon Fluxes
We first evaluate the models using the spatial distribution of long-term GPP annual means (1982-2008; Figure 2). The largest ELMv1-ECA GPP biases are in Northern American boreal forest ecosystems and lowland tropical ecosystems (overestimation) and Northern Eurasia boreal forest ecosystems (underestimation), although the globally integrated bias has been reduced compared with CLM4.0-CN and is similar to that of CLM4.5-BGC (Table 1). Errors in the spatial pattern of modeled GPP have also been reduced compared with the baseline models, resulting in ELMv1-ECA GPP having the highest overall score (0.78) for this benchmark (Table 1).

Variable / Model | Mean | RMSE | Bias score | RMSE score | Seasonal cycle score | Spatial distribution score | Overall score |
---|---|---|---|---|---|---|---|
GPP | [Pg/year] | [Pg/year] | | | | | |
Benchmark | 113 | | | | | | |
ELMv1-ECA | 126 | 55 | 0.80 | 0.70 | 0.79 | 0.94 | 0.78 |
CLM4.5-BGC | 127 | 68 | 0.73 | 0.65 | 0.79 | 0.92 | 0.75 |
CLM4.0-CN | 140 | 76 | 0.66 | 0.60 | 0.79 | 0.79 | 0.67 |
Reco | [Pg/year] | [Pg/year] | | | | | |
Benchmark | 90 | | | | | | |
ELMv1-ECA | 120 | 47 | 0.79 | 0.68 | 0.84 | 0.89 | 0.78 |
CLM4.5-BGC | 122 | 61 | 0.72 | 0.62 | 0.85 | 0.83 | 0.73 |
CLM4.0-CN | 136 | 74 | 0.65 | 0.54 | 0.81 | 0.65 | 0.64 |
LAI | [m2/m2] | [m2/m2] | | | | | |
Benchmark | 1.3 | | | | | | |
ELMv1-ECA | 1.9 | 1.0 | 0.64 | 0.55 | 0.64 | 0.81 | 0.64 |
CLM4.5-BGC | 2.3 | 1.4 | 0.51 | 0.45 | 0.62 | 0.56 | 0.52 |
CLM4.0-CN | 2.2 | 1.3 | 0.52 | 0.47 | 0.66 | 0.56 | 0.54 |
ET | [mm/day] | [mm/day] | | | | | |
Benchmark | 1.2 | | | | | | |
ELMv1-ECA | 1.2 | 0.46 | 0.84 | 0.74 | 0.81 | 0.94 | 0.81 |
CLM4.5-BGC | 1.3 | 0.57 | 0.80 | 0.68 | 0.83 | 0.92 | 0.78 |
CLM4.0-CN | 1.4 | 0.66 | 0.73 | 0.64 | 0.82 | 0.86 | 0.74 |
- Note. The bold numbers represent the best model.
- Abbreviations: CLM4.5-BGC, Community Land Model version 4.5 BGC; ELMv1-ECA, Energy Exascale Earth System Model (E3SM) Land Model version 1-equilibrium chemistry approximation; ILAMB, International Land Model Benchmarking; RMSE, root-mean-square error.
Since photosynthesis (GPP) synthesizes organic carbon using CO2 via metabolic reactions catalyzed by the temperature sensitive Rubisco enzyme, GPP is closely related to water availability, temperature, and plant ET. The observed relationships between those factors and GPP provide useful benchmarks for GPP modeling (Figure 3). As ET, temperature, and precipitation increase, observed (i.e., FLUXNET-MTE) GPP (green line) first increases, then plateaus, and finally declines due to environmental stresses (e.g., high temperature). CLM4.0-CN overpredicts GPP when ET and precipitation are high, while CLM4.5-BGC and ELMv1-ECA are comparable and more consistent with the benchmark. ELMv1-ECA best captures the observed relationship between GPP and ET, particularly when ET ranges from 2 to 4 mm/day. The three models also share some common bias features. In all three models, the maximal GPP bias was around 20°C, which highlights a model deficiency in representing photosynthesis in subtropical ecosystems and an urgent need for future development and improvement for these ecosystems.

In terms of total ecosystem respiration (Reco), all three models shared a similar error spatial structure. That is, Reco is overestimated in most of the tropical rainforest and northern high-latitude ecosystems (Figure 4). However, the integrated error in ELMv1-ECA was much smaller than that in CLM4.5-BGC and CLM4.0-CN, with their global root-mean-square error scores being 0.68, 0.62, and 0.54, respectively (Table 1). We note that Reco and GPP from the FLUXNET-MTE product are not independent, because each was derived from the observed net ecosystem exchange by assuming that respiration is temperature dependent and that nighttime temperature sensitivity is maintained over the diurnal cycle.

Two available benchmarks for net biome production were used in this study: the Global Carbon Project (GCP) estimate (Le Quéré et al., 2016) and the estimate of Hoffman et al. (2014). Over the time period when ELMv1-ECA, GCP, and Hoffman et al. (2014) estimates are all available, we found that ELMv1-ECA agreed better with the GCP net biome production estimates than with the Hoffman et al. (2014) estimates, in terms of both the interannual variability (Figure S1a in the supporting information) and the overall probability density distributions (Figure S1b).
3.2 Leaf Area Index
LAI, the single-sided leaf area per unit ground area, is a critical vegetation property that affects canopy-scale photosynthesis. The ILAMB LAI global benchmark is derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) LAI collection 5 product (MCD15A3; Figure 5). Except for the Arctic and African semiarid regions, all three models tend to overestimate LAI. This overestimation could be due to model parameter or structural biases and/or MODIS LAI algorithm bias. Previous efforts found that although MCD15A3 has been significantly improved compared with the collection 4 product, it still underestimated maximum LAI for high LAI systems, such as mature forests (De Kauwe et al., 2011). Model-data LAI discrepancy is smallest in ELMv1-ECA, and largest in CLM4.0-CN (Figure 5). The global mean LAI of ELMv1-ECA (1.9 m2/m2) is closer to the benchmark (1.3 m2/m2) than that of CLM4.0-CN and CLM4.5-BGC (2.2 and 2.3 m2/m2, respectively; Table 1), so its global mean bias is at least ~30% lower. Although the gridcell level bias of ELMv1-ECA is much lower than that of the other two models, ELMv1-ECA still overestimates LAI over a large fraction of the land surface (e.g., tropics), which warrants further investigation and improvement of model performance and benchmark data.

The observed relationship between LAI and precipitation (Figure 6, green line) indicates that lowland tropical regions (precipitation over 7 mm/day) had the highest potential of developing large LAI. ELMv1-ECA LAI in the lowland tropics is about 6 m2/m2 (Figures 5 and 6), while CLM4.0-CN and CLM4.5-BGC are 8 and 10 m2/m2, respectively. Given that the satellite-derived maximum mean LAI in tropical regions is probably substantially underestimated because of cloud contamination (e.g., at a site in Tapajos Brazil, MCD15A3 indicates LAI of 4 m2/m2, which is ~2 m2/m2 lower than ground camera-based LAI measurements; Wu et al., 2016) and data algorithm saturation (Heiskanen et al., 2012), an estimate of 6-7 m2/m2 for tropical forest LAI is expected, which is consistent with the ELMv1-ECA estimate.

3.3 Evapotranspiration
ET is tightly coupled to leaf level carbon dynamics. For example, in ELMv1 a better representation of photosynthesis rate (An) will potentially improve the estimate of stomatal resistance (rs), based on the Ball-Berry conductance model, and thus the calculation of ET. The observed annual ET is higher in the tropics and lower in the boreal region (Figure 7). Although all three models capture the general observed spatial distribution, CLM4.0-CN and CLM4.5-BGC substantially overpredict ET in the tropics; this bias is significantly lower in ELMv1-ECA (Figure 7). All models tend to underestimate ET in the northern high latitudes, although the absolute biases are smaller than in the tropics. The global root-mean-square error of ELMv1-ECA ET is 0.46 mm/day, significantly less than that of CLM4.5-BGC (0.57 mm/day) and CLM4.0-CN (0.66 mm/day; Table 1). Further, the spatial distribution of ELMv1-ECA ET better matched observations than did CLM4.0-CN and CLM4.5-BGC (Table 1).
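The link between photosynthesis and stomatal resistance can be illustrated with a minimal sketch. It assumes the commonly used Ball-Berry form, in which stomatal conductance is 1/rs = m An hs/cs + b, with hs the fractional humidity and cs the CO2 concentration at the leaf surface. The slope m, the minimum conductance b, and the input values are illustrative assumptions, not ELMv1 parameters.

```python
def ball_berry_conductance(a_n, h_s, c_s, m=9.0, b=0.01):
    """Stomatal conductance (mol m-2 s-1) from the standard Ball-Berry form.

    a_n : net photosynthesis (umol CO2 m-2 s-1)
    h_s : fractional relative humidity at the leaf surface (0-1)
    c_s : CO2 mole fraction at the leaf surface (umol mol-1)
    m, b: slope and minimum conductance; assumed values for illustration only
    """
    # Conductance cannot drop below the minimum when net photosynthesis is negative.
    return m * max(a_n, 0.0) * h_s / c_s + b


def stomatal_resistance(a_n, h_s, c_s, m=9.0, b=0.01):
    """Stomatal resistance is the inverse of conductance."""
    return 1.0 / ball_berry_conductance(a_n, h_s, c_s, m, b)


# A higher photosynthesis rate lowers stomatal resistance, which raises transpiration.
low = stomatal_resistance(a_n=5.0, h_s=0.7, c_s=380.0)
high = stomatal_resistance(a_n=10.0, h_s=0.7, c_s=380.0)
print(low, high)
```

Under this form, any bias in An propagates directly into rs, which is why an improved photosynthesis estimate can also change modeled ET.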

For precipitation between 6 and 14 mm/day (~2,000-5,000 mm/year), CLM4.0-CN predicts relatively higher ET compared to observations (Figure 8). CLM4.5-BGC and ELMv1-ECA both generally follow the observed pattern, with ELMv1-ECA having an overall lower bias. However, under low annual precipitation (<6 mm/day), all three models underpredict observed ET. The observed relationship between ET and surface air temperature is well captured by all three models (Figure 8). ET initially increases as temperature increases due to higher evaporative energy but declines at very high temperatures, due to the limitation from other factors (e.g., water availability in tropical drylands) rather than temperature.

3.4 Global Carbon Pools
Our benchmark of global carbon pools here focuses on aboveground living biomass and total soil carbon stock to 1-m depth. Highest aboveground biomass (~20 kg C/m2) was observed in the lowland tropics (Figure 9). CLM4.0-CN and CLM4.5-BGC predicted the highest biomass density in the lowland tropics, and their predictions are highly biased (biomass bias of about 50 and 30 kg C/m2, respectively). ELMv1-ECA slightly overestimated lowland tropical biomass (Figure 9 and Table 2).

Model | Period mean [Pg] | Bias score | Spatial distribution score | Overall score |
---|---|---|---|---|
Benchmark | 354 | | | |
ELMv1-ECA | 431 | 0.63 | 0.84 | 0.74 |
CLM4.5-BGC | 730 | 0.43 | 0.54 | 0.48 |
CLM4.0-CN | 702 | 0.45 | 0.46 | 0.45 |
- Note. The bold numbers represent the best model.
- Abbreviations: CLM4.5-BGC, Community Land Model version 4.5 BGC; ELMv1-ECA, Energy Exascale Earth System Model (E3SM) Land Model version 1-equilibrium chemistry approximation; ILAMB, International Land Model Benchmarking.
ELMv1-ECA matched both NCSCD and HWSD data sets much better than did CLM4.0-CN or CLM4.5-BGC (Table 3). NCSCD and HWSD are two independent, and often inconsistent, observational benchmarks that may complicate model benchmarking. This problem raises the challenging question of how to harmonize different observational data sets when benchmarking models. It is also important to mention that SoilGrid1km (Hengl et al., 2014) is a third well-known global carbon stock database not yet included in ILAMB, and it differs substantially from both NCSCD and HWSD. Currently, ILAMB weights the NCSCD (45%) and HWSD (55%) metrics to produce an overall score (Collier et al., 2018).
Model | NCSCD score | HWSD score | Overall score |
---|---|---|---|
ELMv1-ECA | 0.71 | 0.72 | 0.71 |
CLM4.5-BGC | 0.61 | 0.51 | 0.55 |
CLM4.0-CN | 0.29 | 0.58 | 0.45 |
- Note. The bold numbers represent the best model.
- Abbreviations: CLM4.5-BGC, Community Land Model version 4.5 BGC; ELMv1-ECA, Energy Exascale Earth System Model (E3SM) Land Model version 1-equilibrium chemistry approximation; HWSD, Harmonized World Soil Database; ILAMB, International Land Model Benchmarking; NCSCD, The Northern Circumpolar Soil Carbon Database.
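As a quick check on how the overall soil carbon score is formed, the following sketch applies the 45%/55% weighting stated above to the NCSCD and HWSD scores in Table 3. The function name is hypothetical, and the sketch is not part of the ILAMB package itself.

```python
def overall_soil_carbon_score(ncscd, hwsd, w_ncscd=0.45, w_hwsd=0.55):
    """Weighted combination of the two soil carbon benchmark scores."""
    return w_ncscd * ncscd + w_hwsd * hwsd


# (NCSCD score, HWSD score, reported overall score) from Table 3.
table3 = {
    "ELMv1-ECA": (0.71, 0.72, 0.71),
    "CLM4.5-BGC": (0.61, 0.51, 0.55),
    "CLM4.0-CN": (0.29, 0.58, 0.45),
}

for model, (ncscd, hwsd, reported) in table3.items():
    combined = overall_soil_carbon_score(ncscd, hwsd)
    print(f"{model}: weighted = {combined:.3f}, reported = {reported:.2f}")
```

The weighted values agree with the reported overall scores to within the rounding of the two-decimal inputs.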
We also evaluated the effects of the modeled P cycle and dynamic allocation on modeled carbon pools (living biomass versus soil carbon stock). ELMv1-ECA with an imposed fixed carbon allocation scheme slightly improved LAI estimates but biased all other variables (Figure 10). In particular, using fixed carbon allocation degraded estimates of biomass and SOC. We also note that excluding the P cycle mostly biased the estimates of carbon fluxes, such as GPP and Reco, but had small effects on biomass and SOC. Overall, ELMv1-ECA showed very good ILAMB scores (greenish color in variable z-score panel of Figure S2) in almost all carbon cycle metrics. For water cycle metrics, ELMv1-ECA performed better in ET and latent heat but worse in terrestrial water storage than CLM4.0 or CLM4.5. For the energy cycle metrics, ELMv1-ECA tended to perform relatively worse in simulating upward shortwave and longwave radiation fluxes and consequently albedo. For the climate forcing metrics, CRUNCEP and GSWP3 were generally similar (variable score panel of Figure S2), although GSWP3 was sometimes better than CRUNCEP (variable z-score panel of Figure S2). In summary, ELMv1-ECA performed significantly better than CLM4.0 and CLM4.5 in most of the available metrics including carbon, water, and energy cycles.

3.5 Nutrient Cycle Evaluation
Compared to metrics for the carbon cycle, global-scale gridded benchmarks for nitrogen and phosphorus cycles are less available. In this study, we used nutrient fluxes of NO3- leaching and gaseous N2O loss (Houlton et al., 2015; Sinha et al., 2017; Zhu & Riley, 2015) to assess the nitrogen loss pathways of ELMv1-ECA. Using soil 15N tracer information, the fraction of gaseous (N2O) versus total N (N2O + NO3- leaching) losses (fdenit; Figure 11) was inferred based on different 15N fractionation effects (Houlton et al., 2015). The latitudinal distribution of observationally inferred fdenit indicates that higher proportions of total N losses occur as gas in the tropics, and this proportion gradually declines toward colder ecosystems. Both CLM4.0 and CLM4.5 failed to simulate this spatial distribution of fdenit (Houlton et al., 2015), while ELMv1-ECA generally captured this latitudinal trend. For the phosphorus cycle, which also lacks a global-scale benchmark, we adopted the idea of phosphorus limitation from the leaf C:N:P stoichiometry perspective based on Wang et al. (2010) and developed a latitudinal distribution of P limitation (Figures 11b and S3). This metric is defined by leaf level relative abundance of nitrogen and phosphorus. ELMv1-ECA predictions indicate that ~60% of tropical ecosystems are more phosphorus limited (rather than N limited) and P limitation is alleviated over temperate ecosystems. Across the southern midlatitude ecosystems, P limitation indicates that these low-temperature systems are inefficient in recycling soil phosphorus although the observed total phosphorus content in high latitude soils (slightly weathered) is much higher than in tropical soils (strongly weathered; Yang et al., 2013). This modeled P limitation distribution is consistent with current understanding of the relationship between plant carbon and soil nutrient cycles at a global scale (Houlton et al., 2008; Walker et al., 2014; Wang et al., 2010).

We evaluated the spatial distribution of leaf nutrient concentrations and leaf to fine root biomass ratios, in spite of a lack of global-scale direct measurements, because they represent an important aspect of the model in terms of carbon-nutrient coupling. The modeled long-term averaged leaf C to N ratio was highest in northern hemisphere tundra ecosystems, and gradually declined toward the tropical rainforest ecosystem (Figure S3). This spatial pattern reversed for leaf C to P ratio, implying that tropical ecosystems are mostly P limited. As a result of the strong P limitation over the tropics, the leaf to fine root ratio was higher over cold regions and relatively lower in tropical ecosystems (Figure S3), which generated a testable hypothesis that plants tend to allocate more carbon into fine root growth for phosphorus acquisition. Based on limited site-level estimates of forest growth in different components, the allocation into leaves is comparable to that into fine roots (Malhi et al., 2009). A comprehensive evaluation of the ELMv1-ECA dynamic allocation scheme will require more observational data.
Although large-scale observational benchmarks for nutrient cycles are lacking, it is still important to document critical nutrient cycle fluxes and compare them with values reported in previous work. In ELMv1-ECA, the global mean N2 fixation rate during the last five decades of the simulation (1961-2010) of 175 Tg N/year compared well with global N2 fixation estimates from the literature (100-290 Tg N/year, Cleveland et al., 1999; 138 Tg N/year, Galloway et al., 2004; Figure S4). The ELMv1-ECA simulation also revealed an increasing N2 fixation trend, which could be explained by the relief of energetic constraints on nitrogenase activity by warming and stimulation by progressively stronger P limitation (Houlton et al., 2008). Global nitrogen losses through hydrological (NO3 leaching and runoff) and gaseous (N2O emissions) pathways were estimated to be 48 and 11 Tg N/year, respectively, based on Galloway et al. (2004). ELMv1-ECA simulated mean annual gaseous losses (10 Tg N/year; Figure S4) were consistent with published estimates (Galloway et al., 2004; Mosier, 1994), while ELMv1-ECA underestimated hydrological NO3 losses (17 Tg N/year), partly because N leaching loss from agricultural land was not considered in the simulation. Spatially, most of the gaseous N losses occurred over tropical regions, where high biological N2 fixation occurs. This pattern implies a relatively open N cycle in tropical ecosystems. In contrast to the primary N input from biological N2 fixation (Figure S4a), the primary P input due to weathering declined over time from 1961 to 2010 due to insufficient replenishment of parent material (39 Tg P/year; Figure S5). Taken together, terrestrial ecosystems progressively became less N limited and more P limited from 1961 to 2010, and these N and P imbalances may continue or increase over the 21st century. Consistent with this argument, ELMv1-ECA showed that during the last five decades, leaf C:P ratio changed more significantly than leaf C:N ratio (Figure S6).
Tropical system C:P ratio was higher than the initial base value, while nontropical system C:P ratio was similar to the initial base value. In contrast, tropical system C:N ratio was similar to the initial base value, while nontropical system C:N was much higher than the initial base value (comparing the locations of boxes and the beginning of the line with same color). The simulated P leaching losses were relatively stable, with a mean annual loss of 13 Tg P/year, which agreed with estimates from other modeling studies, for example, 14 Tg P/year in Wang et al. (2010). Moreover, ELMv1-ECA simulated a significantly lower P weathering input over tropical ecosystems due to depleted parent material (Yang et al., 2013), compared with temperate ecosystems. ELMv1-ECA also had a less prominent tropical-temperate contrast in P leaching losses, compared with P weathering.
3.6 Future Improvement Suggested by Benchmarking
Through benchmarking the ELMv1-ECA predictions, we conclude that the new model captures global patterns of major carbon-related quantities better than either of its baseline predecessor models. However, our benchmarking also reveals important biases against observations and provides information on model mechanisms (and observational deficiencies) that may be responsible for the biases.
Plants open stomata to fix atmospheric CO2 and thereby lose water to the atmosphere. The water use efficiency (WUE; i.e., the ratio between carbon fixed and water transpired) is a critical physiological characteristic for ecosystem interactions with the atmosphere. Over Amazon rainforest and northern boreal forest ecosystems, ELMv1-ECA consistently overestimates GPP (Figure 2) and underestimates ET (Figure 7), leading to an overestimate of ecosystem-scale WUE. This WUE overestimation implies that the modeled Amazon rainforests and northern boreal forests are unrealistically conservative in their water use.
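The way a positive GPP bias and a negative ET bias compound in WUE can be illustrated with a minimal Python sketch. It assumes ecosystem-scale WUE is taken as GPP divided by ET. The function names and the 10% biases are hypothetical and are not taken from ELMv1-ECA or ILAMB.

```python
def water_use_efficiency(gpp, et):
    """Ecosystem-scale WUE, assumed here to be carbon fixed per unit water lost."""
    if et <= 0:
        raise ValueError("et must be positive")
    return gpp / et


def relative_wue_bias(gpp_bias, et_bias):
    """Relative WUE bias implied by relative GPP and ET biases.

    Biases are fractions of the observed value, e.g. 0.10 for +10%
    and -0.10 for -10%.
    """
    if et_bias <= -1.0:
        raise ValueError("et_bias must be greater than -1")
    return (1.0 + gpp_bias) / (1.0 + et_bias) - 1.0


# Hypothetical example: GPP 10% too high and ET 10% too low.
wue_bias = relative_wue_bias(0.10, -0.10)
print(f"Relative WUE bias: {wue_bias:.1%}")
```

Because the two biases act on the numerator and denominator in opposite directions, the resulting WUE bias exceeds either individual bias in magnitude.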
ELMv1-ECA improved carbon cycle predictions unevenly across latitude. A relatively large improvement is found in the tropics, implying that the improved phosphorus limitation, nutrient competition, and dynamic allocation in ELMv1-ECA are mechanistically important for lowland tropical carbon dynamics (Figure 2). However, phosphorus limitation and dynamic allocation may play different roles for different aspects of the carbon cycle (Figure 10). Furthermore, a reliable GPP estimate should be consistent with reasonable LAI estimates; otherwise, a model could reproduce observed GPP by overestimating LAI and underestimating the leaf-level carbon fixation rate (or vice versa). ELMv1-ECA's predicted LAI bias (Figure 5) is relatively low in the tropics, particularly given the likely underestimate of tropical MODIS LAI associated with, for example, cloud contamination and MODIS algorithm saturation (Heiskanen et al., 2012). ELMv1-ECA overestimated LAI in general, but relative biases were lower than those of either baseline model (Figure 5 and Table 1). We note that, although the absolute GPP bias in the pan-Arctic region is small, the relative GPP bias ((model minus observation)/observation) there (67.5-90°N, ~14%) was similar to that in the pan-tropical region (23.5°S-23.5°N, ~16%) and larger than that in the temperate region (23.5-67.5°N, ~4%), because observed Arctic GPP is low.
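The distinction between absolute and relative bias can be made concrete with a short Python sketch of the (model minus observation)/observation definition. The flux values below are hypothetical, in arbitrary units, and were chosen only to show that a small absolute bias in a low-productivity region can give the same relative bias as a much larger absolute bias in a high-productivity region.

```python
def relative_bias(model, observation):
    """Relative bias, (model - observation) / observation."""
    if observation == 0:
        raise ValueError("observation must be nonzero")
    return (model - observation) / observation


# Hypothetical regional mean fluxes (arbitrary units), not model output.
low_obs, low_model = 1.0, 1.15      # low-productivity region
high_obs, high_model = 10.0, 11.5   # high-productivity region

abs_low = low_model - low_obs
abs_high = high_model - high_obs
rel_low = relative_bias(low_model, low_obs)
rel_high = relative_bias(high_model, high_obs)

print(f"low-productivity:  absolute {abs_low:.2f}, relative {rel_low:.0%}")
print(f"high-productivity: absolute {abs_high:.2f}, relative {rel_high:.0%}")
```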
It is challenging to reasonably predict soil carbon stocks. Todd-Brown et al. (2013) showed that most of the carbon-only land models overestimated SOC stocks, while the only model in the CMIP5 analysis that represented nitrogen cycling (CCSM4) substantially underestimated SOC stocks. ELMv1-ECA estimated the global SOC stock to 1-m depth to be ~1,100 Pg C, which is lower than that of CLM4.5-BGC (~1,800 Pg C) and higher than that of CLM4.0-CN (~600 Pg C); the benchmark value is 1,270 Pg C (Todd-Brown et al., 2013). Predicted SOC stocks result from the balance between soil carbon inputs and losses through respiration (erosion and leaching are not yet represented). Although there is no global data set of litter inputs, we have higher confidence in this predicted flux compared with the base models because of the overall improvement in predicted carbon cycle metrics in ELMv1-ECA (Figures 2-6). In addition, the soil carbon turnover rates applied in ELMv1-ECA have been evaluated against some radiocarbon observations (Koven et al., 2013). In other work, we have extended this type of evaluation with a rich data set of soil radiocarbon profiles (Chen et al., 2019).
ELMv1-ECA introduced many new parameters to support model development. Although most of the parameters are directly measurable and largely constrained by existing databases (Tables S1 and S2), we acknowledge a potential tradeoff between the benefits of more mechanistic representations and the associated parameter uncertainty. Future work will focus on parameter sensitivity and uncertainty quantification.
3.7 Suggestions and Limitations
Land models have been intensively evaluated from in situ to global scales, but commonly against only a few metrics in each evaluation. We suggest that the land-modeling community will benefit from systematic benchmarking against multiple data sets during model development. A disaggregated (i.e., against individual observations) and overall (i.e., combining the evaluation into a single metric) model benchmarking approach allows a consistent and repeatable evaluation of model improvements over time. Considering a wide range of metrics also informs whether changes to a model that improve a given process representation simultaneously degrade other model predictions.
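As a minimal illustration of combining disaggregated and overall evaluation, the Python sketch below aggregates per-variable scores into one weighted mean and flags variables whose score declined between two model versions. The scores, weights, and function names are hypothetical; this is not the scoring scheme implemented in ILAMB.

```python
def overall_score(scores, weights=None):
    """Weighted mean of per-variable scores (each assumed to lie in [0, 1])."""
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def degraded_variables(before, after, tolerance=0.0):
    """Names of variables whose score dropped by more than the tolerance."""
    return sorted(
        name for name in before if after[name] < before[name] - tolerance
    )


# Hypothetical per-variable scores for two model versions.
before = {"gpp": 0.60, "lai": 0.55, "et": 0.70}
after = {"gpp": 0.72, "lai": 0.63, "et": 0.66}

print(f"overall before: {overall_score(before):.2f}")
print(f"overall after:  {overall_score(after):.2f}")
print("degraded:", degraded_variables(before, after))
```

In this made-up case the overall score improves even though one variable degrades, which is the kind of tradeoff that only the disaggregated view exposes.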
Here we focus our analysis on the present-day carbon cycle; more work is needed to evaluate nitrogen (N) and phosphorus (P) cycle dynamics against observational benchmarks (Bouskill et al., 2014). We suggest several nutrient cycle benchmarks that are urgently needed for land model development and that will be integrated in a future version of ILAMB. First, global patterns of N and P fluxes are needed, including the major nutrient input fluxes of N2 fixation (Cleveland et al., 1999) and the phosphorus input flux induced by phosphatase activity (Margalef et al., 2017). Second, isotopic tracer studies (e.g., 15N and 33P) are particularly useful to inform the partitioning of N and P between plants and microbes, a key determinant of ecosystem C dynamics (Kuzyakov & Xu, 2013), and the relative competitiveness of individual competitors (Chapin & Bloom, 1976; Keuper et al., 2017; Zhu, Iversen, et al., 2016; Zhu & Riley, 2015; Zhu et al., 2017; Zhu, Riley, et al., 2016). Third, transient responses of the carbon cycle to nutrient availability can provide insight into nutrient limitation effects. In this sense, N and P fertilization experiments (LeBauer & Treseder, 2008), especially those that span multiple years, and FACE experiments (Zaehle et al., 2014) are particularly useful to inform carbon-nutrient interactions.
4 Conclusions
In this study, we present new developments in the E3SM land model (ELMv1-ECA) in the representation of the nitrogen and phosphorus cycles and their coupling with carbon dynamics. ELMv1-ECA has several new features in nutrient dynamics and carbon-nutrient coupling that are conceptually different from its predecessors (CLM4.0-CN and CLM4.5-BGC). We benchmark ELMv1-ECA, CLM4.0-CN, and CLM4.5-BGC carbon cycle predictions using the International Land Model Benchmarking (ILAMB) package and conclude that ELMv1-ECA robustly represents the present-day carbon cycle by comparison with observed gross primary productivity, total ecosystem respiration, LAI, ET, vegetation biomass, and soil carbon stocks. The new model is a substantial improvement over its predecessor models. We also show that model benchmarking against multiple data sets is helpful and conclude that continuous benchmarking throughout model development can help improve land model performance.
Acknowledgments
This research was supported by the Energy Exascale Earth System Modeling (E3SM, https://e3sm.org/) Project and the Reducing Uncertainties in Biogeochemical Interactions through Synthesis and Computation (RUBISCO) Scientific Focus Area, which are sponsored by the Earth and Environmental Systems Modeling (EESM) Program under the Office of Biological and Environmental Research of the U.S. Department of Energy Office of Science. Lawrence Berkeley National Laboratory (LBNL) is managed by the University of California for the U.S. Department of Energy under contract DE-AC02-05CH11231. Oak Ridge National Laboratory (ORNL) is managed by UT-Battelle, LLC, for the U.S. Department of Energy under contract DE-AC05-00OR22725. All data are available at the figshare data repository: https://figshare.com/articles/ilamb-ELMECA-data_zip/5722369. The E3SM model code is available at https://e3sm.org/.