Volume 128, Issue 3 e2022JG007187
Research Article
Open Access

The Impact of Crop Rotation and Spatially Varying Crop Parameters in the E3SM Land Model (ELMv2)

Eva Sinha

Corresponding Author

Pacific Northwest National Laboratory, Richland, WA, USA

Correspondence to:

E. Sinha,

[email protected]

Contribution: Conceptualization, Methodology, Validation, Formal analysis, Investigation, Resources, Writing - original draft, Writing - review & editing, Visualization

Ben Bond-Lamberty

Pacific Northwest National Laboratory, Joint Global Change Research Institute, College Park, MD, USA

Contribution: Conceptualization, Methodology, Writing - review & editing, Supervision, Project administration, Funding acquisition

Katherine V. Calvin

Pacific Northwest National Laboratory, Joint Global Change Research Institute, College Park, MD, USA

Contribution: Conceptualization, Methodology, Supervision, Project administration, Funding acquisition

Beth A. Drewniak

Argonne National Laboratory, Lemont, IL, USA

Contribution: Resources, Writing - review & editing, Supervision

Gautam Bisht

Pacific Northwest National Laboratory, Richland, WA, USA

Contribution: Methodology, Resources, Writing - review & editing

Carl Bernacchi

Global Change and Photosynthesis Research Unit, USDA-ARS, Urbana, IL, USA

University of Illinois at Urbana-Champaign, Urbana, IL, USA

Contribution: Resources, Writing - review & editing

Bethany J. Blakely

University of Illinois at Urbana-Champaign, Urbana, IL, USA

Contribution: Resources

Caitlin E. Moore

University of Illinois at Urbana-Champaign, Urbana, IL, USA

School of Agriculture and Environment, The University of Western Australia, Crawley, WA, Australia

Contribution: Resources, Writing - review & editing

First published: 16 March 2023

Abstract

Earth System Models (ESMs) are increasingly representing agriculture due to its impact on biogeochemical cycles, local and regional climate, and fundamental importance for human society. Realistic large scale simulations may require spatially varying crop parameters that capture crop growth at various scales and among different cultivars, as well as common crop management practices, but their importance is uncertain, and they are often not represented in ESMs. In this study, we examine the impact of using constant versus spatially varying crop parameters using a novel, realistic crop rotation scenario in the Energy Exascale Earth System Model (E3SM) Land Model version 2 (ELMv2). We implemented crop rotation by using ELMv2's dynamic land unit capability, and then calibrated and validated the model against observations collected at three AmeriFlux sites in the US Midwest with corn soybean rotation. The calibrated model closely captured the magnitude and observed seasonality of carbon and energy fluxes across crops and sites. We performed regional simulations for the US Midwest using the calibrated model and found that spatially varying only a few crop parameters across the region, as opposed to using constant parameters, had a large impact, with the carbon fluxes and energy fluxes both varying by up to 40%. These results imply that large scale ESM simulations using spatially invariant crop parameters may yield biased estimates of energy and carbon fluxes from agricultural land, and underline the importance of improving human-earth systems interactions in ESMs.

Plain Language Summary

Crops are increasingly being characterized in global land models because of their impact on local and regional climate. However, there is limited understanding of the impact of crop rotation and of different crop cultivars on carbon and energy fluxes from the land surface. Our study implements crop rotation and spatially varying crop parameters in the Energy Exascale Earth System Model Land Model and finds that doing so improves carbon and energy flux estimation from cropland area. These findings emphasize the importance of capturing agricultural management practices and variability in growth characteristics across different crop cultivars in global land models.

Key Points

  • This study implements corn soybean rotation and spatially varying crop parameters in the Energy Exascale Earth System Land Model

  • The model is calibrated and validated against observations collected at three AmeriFlux sites in the US Midwest

  • We find that spatially varying crop parameters resulted in improved flux estimation from cropland areas

1 Introduction

Agriculture affects local, regional, and global climate through greenhouse gas emissions and through modification of biogeochemical cycles and of water and energy budgets (McDermid et al., 2017). Agricultural land conversion, expansion, and intensification are a major source of greenhouse gas (GHG) emissions, contributing 23% of total anthropogenic GHG emissions (IPCC, 2019). Agricultural intensification also impacts local and regional temperature and precipitation via modification of surface energy partitioning and an increase in evapotranspiration (Lobell et al., 2006; Mueller et al., 2016; D. Lombardozzi et al., 2018). Due to these impacts, and because of its fundamental importance to human societies and well-being, agriculture is increasingly being represented in Earth System Models (ESMs) (Drewniak et al., 2013; Levis et al., 2012; Liu et al., 2016; Osborne et al., 2015; Wu et al., 2016).

ESMs, however, lack adequate representation of crop rotation, a management practice that is dominant in North America and common worldwide (Sahajpal et al., 2014; Wallander, 2013). Crop rotation, where different crops are grown on the same land across a sequence of growing seasons, has been practiced since historical times because it improves soil quality (Karlen et al., 2006), increases soil carbon sequestration (West & Post, 2002), enhances microbial richness and diversity (Venter et al., 2016), and increases crop yield while reducing fertilizer requirements (Bowles et al., 2020; Smith et al., 2008; Stanger & Lauer, 2008). Crop rotation can also help mitigate and adapt to climate change due to its potential for carbon sequestration and reducing nitrogen loss from agricultural systems (Lal et al., 2011; Wang et al., 2010).

Crop rotation is increasingly being implemented in land model components of ESMs. Sequential cropping, where multiple crops are grown on the same land in a given year, has been implemented in CLM5.0 for site level data in central Europe (Boas et al., 2021) and in the Joint UK Land Environment Simulator (JULES) at site level in France and at regional level in India (Mathison et al., 2021). Crop rotation was also previously implemented in CLM5.0 for a single site in the US Midwest (Cheng et al., 2020). All of these studies modified the crop parameters based on values in the literature, field observation, or calibration using a simple one-at-a-time approach that varies a single model parameter at a time. This simplistic parameterization approach, however, fails to account for the impact of joint parameter variability on model outputs (Qian et al., 2018; Ricciuto et al., 2018) and may thus fail to accurately capture fluxes from croplands.

Adequate crop representation depends on calibrating various crop parameters, and large scale ESM simulations likely require parameters that are scale- and cultivar-dependent. Calibrating ESM crop parameters using site-level observations is challenging due to the limited availability of observational data and the computational cost involved with model calibration and validation. Due to these limitations, static or spatially invariant crop parameters are often used for regional/global runs (Osborne et al., 2009; Levis et al., 2012; Drewniak et al., 2013; D. L. Lombardozzi et al., 2020). However, crop parameters can be scale- and cultivar-dependent (Iizumi et al., 2014; Mohammadi, 2007), so spatially invariant parameters can result in biases between observed and simulated fluxes and are not recommended for large scale simulations (Iizumi et al., 2014). The prevalence and magnitude of such a bias are, however, poorly understood.

The objective of this study is to understand the impact of using constant versus varying crop parameters in a realistic crop-rotation scenario, and quantify the resulting model's fidelity against high-quality AmeriFlux observational data. To do so, we enhance the crop modeling capability of the Energy Exascale Earth System Model (E3SM) land component version 2 (ELMv2) by implementing corn soybean rotation in ELM; calibrate and validate the model using observations from multiple sites; and quantify the impact of different parameterization schemes on carbon and energy fluxes in a regional North America simulation.

2 Methodology

2.1 ELM Crop Model

The E3SM land model version 2 (ELMv2) is branched from CLM version 4.5 (CLM4.5) (Oleson et al., 2013). The major additions to ELM since diverging from CLM4.5 are described in detail in Golaz et al. (2022), Burrows et al. (2020), and Ricciuto et al. (2018); they include improved representation of atmospheric aerosols, a minor bug fix in evaporation estimation from pervious surfaces, an updated scheme for calculation of leaf stomatal conductance, and modification to the nighttime albedo calculation. The ELM crop model includes representation of major crop types in order to capture the biogeochemical and biophysical impact of crops on land surfaces (Drewniak et al., 2013; Levis et al., 2012). To date it has not included any crop rotation capability.

2.2 Corn Soybean Rotation Implementation

We implemented a corn soybean rotation, the most common such rotation type in North America (Wallander, 2013), in ELM by using the model's dynamic land unit capability. Similar to the Community Land Model (CLM) version 5.0 (Lawrence et al., 2019), dynamic land units allow for the fraction of crop functional types (cfts) in each soil column to be adjusted over time, as specified in the model's input land use time series. Modifying the cft percentages from one year to the next thus results in a realistic rotation between two or more crops. For instance, corn soybean rotation for site-scale simulation was represented in the land use time series by switching from 100% corn to 100% soybean for the crop rotation years. A corn soybean rotation for our regional simulation was implemented in the land use time series based on the Land-Use Harmonization 2 (LUH2) transition data set and is described in Section 2.4.1.
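The site-scale switching between 100% corn and 100% soybean can be illustrated with a minimal Python sketch. The function name `rotation_series` and the dictionary layout are hypothetical and only mimic the idea of a yearly land use time series; they are not ELM's input file format.

```python
def rotation_series(start_year, end_year, pattern=("corn", "soybean")):
    """Return {year: {crop: percent}} cycling through `pattern`, one crop per year.

    Hypothetical helper: the active crop gets 100% of the cropped column,
    every other crop in the pattern gets 0%.
    """
    crops = sorted(set(pattern))
    series = {}
    for offset, year in enumerate(range(start_year, end_year + 1)):
        active = pattern[offset % len(pattern)]
        series[year] = {crop: (100.0 if crop == active else 0.0) for crop in crops}
    return series


# Annual corn soybean rotation over an illustrative four-year window.
annual = rotation_series(2001, 2004)

# Two years of corn followed by one year of soybean, as described for US-UiC.
uic = rotation_series(2008, 2016, pattern=("corn", "corn", "soybean"))
```

With the 2008 to 2016 window and the corn, corn, soybean pattern, the soybean years in this sketch fall in 2010, 2013, and 2016, which matches the US-UiC soybean years listed in Table 2.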

2.3 Site-Scale Calibration and Validation

2.3.1 Site Level Data

The model was calibrated and validated based on observations from three corn soybean rotation sites in the US Midwest (Figure 1). Site location and mean climatic conditions are summarized in Table 1; all sites are rainfed, that is, have no irrigation. At the US-Ne3 and US-Ro1 sites, rotation occurred every year between 2001–2014 and 2004–2016, respectively. At the US-UiC site, the rotation consisted of two years of corn followed by one year of soybean, from 2008 to 2016. Meteorological forcing data collected at the three sites, including air temperature, precipitation, downwelling shortwave radiation, downwelling longwave radiation, humidity, air pressure, and wind speed, were used for model simulation. For US-Ne3 we used meteorological forcing data collected between 2002 and 2015, for US-Ro1 between 2009 and 2012, and for US-UiC between 2011 and 2016.

Figure 1. Location of AmeriFlux observational sites and three sub-regions of the US Midwest used for the regional run. Observational sites used for site level calibration and validation are shown in red and sites used for regional validation are shown in green.

Table 1. Observational Sites Used for Site Level Calibration and Validation and Regional Validation

| Usage | Site ID | City | State | Latitude | Longitude | Elevation (m) | Mean annual temp (°C) | Mean annual precip (mm) | Citation |
|---|---|---|---|---|---|---|---|---|---|
| Site level calibration/validation | US-Ne3 | Mead | NB | 41.18 | −96.44 | 363 | 10.1 | 784 | Suyker (2022) |
| Site level calibration/validation | US-Ro1 | Rosemount | MN | 44.71 | −93.09 | 290 | 6.4 | 879 | Baker and Griffis (2018) |
| Site level calibration/validation | US-UiC | Champaign | IL | 40.07 | −88.20 | 224 | 10.9 | 1,051 | Bernacchi (2022) |
| Regional validation | US-Bo1 | Bondsville | IL | 40.01 | −88.29 | 219 | 11.2 | 991 | Meyers (2016) |
| Regional validation | US-Br1 | Brooks | IL | 41.97 | −93.69 | 313 | 9.0 | 842 | Prueger and Parkin (2016) |
| Regional validation | US-IB1 | Batavia | IL | 41.86 | −88.22 | 227 | 9.2 | 929 | Matamala (2019) |

The data for the US-Ne3 site is part of the FLUXNET 2015 data set that was gap-filled and processed based on the methodology described in Pastorello et al. (2020). The gap filled and partitioned data for the US-Ro1 site was downloaded from the AmeriFlux website (downloaded in April 2021). For the US-UiC site, the gross primary productivity (GPP) and ecosystem respiration (ER) were calculated using standard methodologies from net ecosystem exchange values measured at the eddy covariance flux towers (Moore et al., 2020). The flux tower derived GPP and ER are referred to as observed GPP and ER, respectively, in the remainder of the manuscript. The US-UiC data is not currently available on the AmeriFlux website, but will be in the future. We converted the half-hourly and hourly data from these sites into daily averages for calibrating and validating ELMv2.
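The conversion of half-hourly or hourly records into daily averages can be sketched as below. This is an illustrative standard-library example, not the processing code used in the study; the function name `daily_means`, the record layout, and the flux values are assumptions.

```python
from collections import defaultdict
from datetime import datetime


def daily_means(records):
    """Average sub-daily (timestamp, value) pairs into {date: mean}.

    Missing values (None) are skipped, which is a simplification; the
    study relies on gap-filled data products instead.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, value in records:
        if value is None:
            continue
        day = timestamp.date()
        sums[day] += value
        counts[day] += 1
    return {day: sums[day] / counts[day] for day in sums}


# Made-up half-hourly flux values, for illustration only.
records = [
    (datetime(2011, 6, 1, 0, 0), 1.0),
    (datetime(2011, 6, 1, 0, 30), 3.0),
    (datetime(2011, 6, 1, 1, 0), None),
    (datetime(2011, 6, 2, 0, 0), 2.0),
    (datetime(2011, 6, 2, 0, 30), 4.0),
]
daily = daily_means(records)
```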

2.3.2 Model Calibration

We used carbon and energy flux measurements at the three sites to calibrate the model, and leaf area index (LAI), canopy height, and harvest yield to validate it. Simulated harvest yield here refers to grain harvest and captures the carbon flux into the grain pool.

The model parameters were calibrated following the approach of Sinha et al. (2022): first developing an ELM surrogate model across a range of input parameters, then performing a sensitivity analysis to identify the most influential parameters, and lastly performing Bayesian calibration of these surrogate models to find optimum ELM parameter values.

We identified 12 crop parameters related to crop phenology, crop management, CN allocation, and photosynthetic capacity whose values are most uncertain. The input range for each parameter was identified based on literature review and expert judgment (Table 4). The values of these parameters were randomly varied over their uniform prior ranges to generate 2,000 ELM simulations. The default values for the crop parameters that were not optimized are listed in Table S1 in Supporting Information S1. ELM simulations were submitted via the Offline Land Model Testbed (Ricciuto, 2022) and each ran for 200 years in the accelerated spin-up mode, 200 years in the non-accelerated spin-up mode (Thornton & Rosenbloom, 2005), and 165 years in transient mode from 1850 to 2015. The 2,000 spin-up simulations were performed to bring the carbon pool to equilibrium for 1850 climatic conditions. Crops were then activated in the transient simulations by using the land use time series containing information on land cover change and crop rotation. Corn and soybean crops were rotated for eight to fourteen years, corresponding to the observed crop rotation years at the AmeriFlux sites. At all three AmeriFlux sites, c3 grass was simulated for the years prior to the start of crop rotation. For the spin-up and transient simulations, the available forcing data for each site was recycled. The model output was postprocessed for four output Quantities of Interest (QoIs): gross primary productivity (GPP), ecosystem respiration (ER), latent heat flux (LE), and sensible heat flux (H). The postprocessing involved estimating the daily average of each QoI over the last 10 years of the transient run; these daily averages were then used to develop a surrogate for each day of the year, for each QoI, crop, and site. Similar to the approach of Sinha et al. (2022), 1,600 ELM simulations were used for developing the surrogates and 400 were used for testing their accuracy.
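The ensemble design described above (uniform sampling over prior ranges, then a 1,600/400 split) can be sketched as follows. The ranges are the corn ranges for four of the twelve parameters in Table 4; the function name, the seed, and the choice of a simple ordered split are assumptions made for illustration.

```python
import random

# Corn prior ranges for a subset of the parameters in Table 4.
CORN_PRIOR_RANGES = {
    "planting_temp": (287.0, 293.0),
    "leafcn": (8.0, 25.0),
    "mbbopt": (4.0, 12.0),
    "slatop": (0.03, 0.08),
}


def sample_ensemble(ranges, n_members, seed=0):
    """Draw `n_members` parameter sets independently from uniform priors."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
        for _ in range(n_members)
    ]


ensemble = sample_ensemble(CORN_PRIOR_RANGES, 2000)

# 1,600 members for building the surrogates, 400 held out for testing them.
train_members = ensemble[:1600]
test_members = ensemble[1600:]
```

Because the members are drawn independently, taking the first 1,600 for training is equivalent to a random split; a real workflow might shuffle explicitly.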

Sobol sensitivity indices were used to examine parametric uncertainty (Saltelli et al., 2010; Sobol, 2001) and identify the most influential parameters for reducing this uncertainty in model outputs. We evaluated the main effect sensitivity that estimates the contribution of one parameter at a time to the total variance in the output variable.
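As an illustration of a main effect (first-order) Sobol index, the sketch below applies the estimator of Saltelli et al. (2010) to a toy linear model with inputs uniform on [0, 1]. The toy model, sample size, and function names are assumptions; the study computes these indices from its ELM surrogates, not from this code.

```python
import random


def main_effect_indices(model, n_params, n_samples=20000, seed=0):
    """Estimate first-order Sobol indices for inputs uniform on [0, 1].

    Uses two independent sample matrices A and B; for parameter i, the
    matrix A_B^(i) is A with column i taken from B (Saltelli et al., 2010).
    Outputs are centered before averaging to reduce estimator noise.
    """
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    f_a = [model(row) for row in a]
    f_b = [model(row) for row in b]

    outputs = f_a + f_b
    mean = sum(outputs) / len(outputs)
    variance = sum((y - mean) ** 2 for y in outputs) / len(outputs)

    indices = []
    for i in range(n_params):
        f_ab = [
            model(row_a[:i] + [row_b[i]] + row_a[i + 1:])
            for row_a, row_b in zip(a, b)
        ]
        partial_variance = sum(
            (fb - mean) * (fab - fa) for fa, fb, fab in zip(f_a, f_b, f_ab)
        ) / n_samples
        indices.append(partial_variance / variance)
    return indices


def toy_model(x):
    """Toy output: the second input carries twice the weight of the first."""
    return x[0] + 2.0 * x[1]


toy_indices = main_effect_indices(toy_model, n_params=2)
```

For this toy model the analytical main effects are 0.2 and 0.8 (variances of 1/12 and 4/12 out of a total of 5/12), so the estimates should land close to those values.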

Model parameters were calibrated to better match the model outputs to observations. We used Markov chain Monte Carlo (MCMC) to sample the parameter input space and reduce the bias between model output and observations. MCMC requires a large number of model evaluations; we met this requirement by using computationally inexpensive surrogate models (see above) instead of computationally expensive ELM model outputs. Calibration was performed simultaneously for all four QoIs to identify a single set of parameter values for each crop. We limited the maximum number of parameters for calibration to five, and selected parameters for which the probability density function of the optimized parameter was normally distributed within the input range instead of being skewed to either side of the input range. Since GPP is low or negligible during the non-growth period, we calibrated it only during the growth period; this GPP calibration window was from May 8 to October 10 for US-Ne3 corn; May 29 to September 26 for US-Ne3 soybean; June 3 to October 2 for US-Ro1 corn; June 19 to September 12 for US-Ro1 soybean; April 30 to October 27 for US-UiC corn; and April 30 to October 27 for US-UiC soybean. The other three QoIs were calibrated using observations for all days in the year.
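The idea of running MCMC against an inexpensive surrogate can be sketched with a random-walk Metropolis sampler on a one-parameter toy problem. Everything here is hypothetical: the surrogate, the synthetic observations, the Gaussian error model, and the step size are assumptions chosen for illustration, and the study's actual sampler and likelihood may differ.

```python
import math
import random

# Toy seasonal curve over 11 "days"; the surrogate scales it by theta.
SHAPE = [math.sin(math.pi * day / 10) for day in range(11)]


def surrogate(theta):
    """Cheap stand-in for an ELM surrogate: predicted flux per day."""
    return [theta * s for s in SHAPE]


OBSERVED = surrogate(5.0)   # synthetic observations generated with theta = 5
PRIOR_RANGE = (4.0, 12.0)   # uniform prior bounds (assumed)
SIGMA = 0.5                 # assumed observation error standard deviation


def log_posterior(theta):
    """Uniform prior plus Gaussian log-likelihood, up to a constant."""
    low, high = PRIOR_RANGE
    if not low <= theta <= high:
        return -math.inf
    residuals = [m - o for m, o in zip(surrogate(theta), OBSERVED)]
    return -0.5 * sum(r * r for r in residuals) / SIGMA ** 2


def metropolis(log_post, start, step, n_steps, seed=0):
    """Random-walk Metropolis sampler; `start` must lie inside the prior."""
    rng = random.Random(seed)
    current, current_lp = start, log_post(start)
    chain = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


chain = metropolis(log_posterior, start=8.0, step=0.3, n_steps=5000)
posterior = chain[1000:]    # discard burn-in
posterior_mean = sum(posterior) / len(posterior)
```

Because each surrogate call costs microseconds, thousands of chain steps are affordable, which is the reason for replacing the full model inside the sampler.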

2.3.3 Model Validation

For each site-crop, 2 years of carbon flux, energy flux, leaf area index (LAI), canopy height, and harvest measurements were used for model validation (Table 2).

Table 2. Calibration and Validation Years

| Usage | Site | Corn calibration years | Corn validation years | Soybean calibration years | Soybean validation years |
|---|---|---|---|---|---|
| Site level calibration/validation | US-Ne3 | 2001, 2007, 2009, 2011, 2013 | 2003, 2005 | 2002, 2008, 2010, 2012, 2014 | 2004, 2006 |
| Site level calibration/validation | US-Ro1 | 2005, 2007, 2009, 2011 | 2013, 2015 | 2004, 2006, 2008, 2010, 2012 | 2014, 2016 |
| Site level calibration/validation | US-UiC | 2009, 2011, 2012 | 2014, 2015 | 2010, 2013 | 2016 |
| Regional validation | US-Bo1 | | 1997, 1999, 2001, 2003, 2005, 2007 | | 1998, 2000, 2002, 2004, 2006, 2008 |
| Regional validation | US-Br1 | | 2005, 2007, 2009, 2011 | | 2006, 2008, 2010 |
| Regional validation | US-IB1 | | 2006, 2008, 2010, 2012, 2014, 2016, 2018 | | 2005, 2007, 2009, 2011, 2013, 2015, 2017 |

The optimized parameters obtained from model calibration were used to run a single model simulation for each site. Similar to the calibration runs, the validation simulation ran for 200 years in the accelerated spin-up mode, 200 years in the nonaccelerated spin-up mode, followed by a transient run from 1850 to 2015 that used site specific meteorological data. For model validation, simulated carbon fluxes and energy fluxes were compared to the observations for the validation years, while LAI and annual harvest yield were compared to observations for all years, since LAI and harvest were not used for model calibration.

2.4 Regional Analysis

2.4.1 Generation of Corn Soybean Rotation Historical Landuse Timeseries

Corn soybean rotation was represented in the historical land use time series from 2000 to 2015 based on information in the LUH2 historical transition data set (Hurtt et al., 2020). LUH2 provides landuse transition information at an annual temporal resolution and at 0.25° spatial resolution between five cfts: C3 annuals, C4 annuals, C3 perennials, C4 perennials, and C3 nitrogen fixers. Several crop types are aggregated into each of these five cfts. For the US Midwest, the crop with the largest harvested acreage in the C4 annual cft is corn, while in the C3 nitrogen fixer cft it is soybean. Therefore, the LUH2 historical transition from C4 annual (c4ann) to C3 nitrogen fixer (c3nfx) was used for generating corn soybean rotation for the US Midwest between 2000 and 2015. This transition is represented as unit fraction per gridcell (fracc4ann_to_c3nfx). The corn soybean rotation was implemented in the landuse timeseries by, first, identifying grid cells with fracc4ann_to_c3nfx greater than 5% within the United States (Figure S1c in Supporting Information S1). Second, the fraction of corn or soybean in each ELM land use timeseries gridcell was modified such that in even years between 2000 and 2015 the fraction of soybean was transferred to corn, while in odd years the fraction of corn was transferred to soybean (Equations 1 and 3). During this time period, the fractions of soybean in even years and corn in odd years were reduced corresponding to the increase in the other crop; this maintained the total corn and soybean area (Equations 2 and 4). Prior to 2000, crop rotation was not implemented in the landuse time series.

For even years between 2000 and 2015:
urn:x-wiley:21698953:media:jgrg22417:jgrg22417-math-0001(1)
urn:x-wiley:21698953:media:jgrg22417:jgrg22417-math-0002(2)
For odd years between 2000 and 2015:
urn:x-wiley:21698953:media:jgrg22417:jgrg22417-math-0003(3)
urn:x-wiley:21698953:media:jgrg22417:jgrg22417-math-0004(4)
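The area-conserving transfer described in this section can be sketched in Python as below. This is one plausible reading of the text, not a reproduction of Equations 1 to 4: the function name `rotate_cell` and the `transfer_share` parameter (the share of the donor crop's fraction that is moved) are hypothetical, and the default of moving the whole donor fraction is an assumption.

```python
def rotate_cell(corn_frac, soy_frac, year, transfer_share=1.0):
    """Return (corn, soybean) grid-cell fractions after the yearly transfer.

    Even years: soybean area is moved to corn.
    Odd years:  corn area is moved to soybean.
    The donor crop is reduced by exactly the amount the other crop gains,
    so corn + soybean area is unchanged.
    """
    if year % 2 == 0:
        moved = transfer_share * soy_frac
        return corn_frac + moved, soy_frac - moved
    moved = transfer_share * corn_frac
    return corn_frac - moved, soy_frac + moved


# Illustrative grid cell with 30% corn and 20% soybean.
corn_2000, soy_2000 = rotate_cell(0.30, 0.20, 2000)   # even year
corn_2001, soy_2001 = rotate_cell(0.30, 0.20, 2001)   # odd year
```

Whatever amount is moved, the sketch keeps the combined corn and soybean fraction constant, which is the property the text attributes to Equations 2 and 4.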

2.4.2 Regional Simulation

Regional simulations were performed for the Corn Belt in the US Midwest divided into three sub-regions: Northern Rockies, Upper Midwest, and Ohio Valley. These sub-regions were roughly based on the NOAA's US climatic regions (https://www.ncdc.noaa.gov/monitoring-references/maps/us-climate-regions) and each sub-region contained one of the calibration sites (Figure 1). Importantly, we used these sub-regions only to demonstrate the impact of constant versus spatially varying parameters; they do not represent fixed application boundaries for the optimized parameters. We performed five regional simulations using the optimized parameters obtained from the three calibration sites. Of these, three (Set1, Set2, and Set3) used the same crop parameters for all three regions, while the fourth set (Composite) incorporated varying crop parameters from all three sub-regions (Table 3). Finally, a fifth regional simulation (Default) was performed with default (uncalibrated) ELM crop parameters. Corn soybean rotation was implemented in all five regional simulations.

Table 3. Crop Parameters Used for Different Sets and Regions

| Set/Sub-regions | Northern Rockies | Upper Midwest | Ohio Valley |
|---|---|---|---|
| Set1 | US-Ne3 optimization | US-Ne3 optimization | US-Ne3 optimization |
| Set2 | US-Ro1 optimization | US-Ro1 optimization | US-Ro1 optimization |
| Set3 | US-UiC optimization | US-UiC optimization | US-UiC optimization |
| Composite | US-Ne3 optimization | US-Ro1 optimization | US-UiC optimization |

Table 4. Descriptions, Input Ranges, and Sources of Information Used for the Twelve Input Parameters Varied in This Study

| ELM variable | Units | Description | Default, corn | Default, soybean | Range, corn | Range, soybean | Source |
|---|---|---|---|---|---|---|---|
| planting_temp | K | Average 10-day temperature required for plant emergence | 287 | 288 | 287–293 | 287–293 | 1 |
| declfact | | Decline factor for gddmaturity | 1.05 | 1.05 | 0.7–1.575 | 0.7–1.575 | 1 |
| fertnitro | kgN m−2 | Maximum fertilizer to be applied | 0.015 | 0.0025 | 0.01–0.02 | 0.002–0.003 | 1 |
| lfemerg | | Leaf emergence parameter | 0.03 | 0.03 | 0.01–0.05 | 0.01–0.05 | 1 |
| mxmat | | Maximum number of days to maturity | 165 | 150 | 125–175 | 125–175 | 1 |
| hybgdd | °day | Growing degree days required for maturity | 1,700 | 1,900 | 1,275–2,125 | 1,425–2,375 | 2 |
| leafcn | gC gN−1 | Leaf CN ratio | 25 | 25 | 8–25 | 8–25 | 3 |
| laimx | | Maximum leaf area index | 5 | 6 | 4–7 | 3–7 | 4 |
| slatop (SLA) | m2 gC−1 | Specific leaf area (SLA) at top of canopy, projected area basis | 0.05 | 0.07 | 0.03–0.08 | 0.02–0.07 | 5 |
| br_mr × 10−6 | umol CO2 m−2 s−1 | Base rate for maintenance respiration (MR) | 2.52 | 2.52 | 1.26–3.75 | 1.26–3.75 | 6 |
| q10_mr | | Temperature sensitivity for MR | 2.2 | 2.2 | 1.3–3.3 | 1.3–3.3 | 6 |
| mbbopt | | Ball–Berry model equation slope | 4.0 | 9.0 | 4–12 | 4–12 | 7 |

  • Note. The ranges are based on (1) expert judgment, where the literature is insufficient and a range within 25% of the default value would be inappropriate; (2) within 25% of the default value; (3) Srivastava et al. (2006) and Li et al. (2019); (4) Baez-Gonzalez et al. (2005) and Nguy-Robertson et al. (2012); (5) Nagasuga et al. (2014); (6) Ricciuto et al. (2018); (7) personal communication with Dr. Dan Ricciuto.

Similar to the site-level simulations, the regional simulations involved running the model in accelerated spin-up mode for 200 years, followed by nonaccelerated spin-up mode for 200 years, and a transient run from 1850 to 2015. The spin-up simulations were performed to bring the carbon pool to equilibrium for 1850 climatic conditions; crops were then activated in the transient simulations by using the land use time series containing information on land cover change and crop rotation. For the spin-up simulations, natural vegetation consisting of c3 grass, c4 grass, and broadleaf deciduous temperate trees was used in place of croplands. For the regional simulations, meteorological forcing was based on the Global Soil Wetness Project Phase 3 (GSWP3) data. The GSWP3 data set was chosen because, compared to other forcing datasets, it has shown better agreement with benchmarks for both the forcing variables and the CLM5 output variables obtained when CLM5 is forced with it (Lawrence et al., 2019). The GSWP3 forcing data is available from 1901 to 2014; therefore, for the spin-up simulations and the transient runs from 1850 to 1900, GSWP3 forcing data from 1901 to 1920 was recycled. The 1920–2014 transient runs used the GSWP3 data from that period.

ELM currently only accepts spatially and temporally constant crop parameters. Therefore, ELM outputs for the composite set with spatially varying crop parameters were generated by combining the outputs from Set1, Set2, and Set3; output value for grid cells within the Northern Rockies were obtained from Set1, for grid cells within the Upper Midwest from Set2, and for grid cells within the Ohio Valley from Set3.
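The assembly of the composite output from the three constant-parameter runs can be sketched as follows. The mapping of sub-regions to sets follows the text; the data layout (dictionaries keyed by a grid-cell identifier) and the numeric values are made up for illustration and do not reflect ELM's output format.

```python
# Sub-region to parameter-set mapping, as described in the text.
REGION_TO_SET = {
    "Northern Rockies": "Set1",
    "Upper Midwest": "Set2",
    "Ohio Valley": "Set3",
}


def build_composite(cell_region, outputs_by_set):
    """Pick, for every grid cell, the output from the run tied to its sub-region.

    cell_region:    {cell_id: sub-region name}
    outputs_by_set: {set name: {cell_id: simulated value}}
    """
    composite = {}
    for cell, region in cell_region.items():
        composite[cell] = outputs_by_set[REGION_TO_SET[region]][cell]
    return composite


# Hypothetical three-cell domain with made-up GPP values per run.
cell_region = {"c1": "Northern Rockies", "c2": "Upper Midwest", "c3": "Ohio Valley"}
outputs_by_set = {
    "Set1": {"c1": 4.0, "c2": 4.5, "c3": 5.0},
    "Set2": {"c1": 6.0, "c2": 6.5, "c3": 7.0},
    "Set3": {"c1": 3.0, "c2": 3.5, "c3": 3.8},
}
composite = build_composite(cell_region, outputs_by_set)
```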

2.4.3 Regional Validation

We used FluxCom (Jung et al., 2020) based GPP estimates for validating the GPP simulated for the US Midwest. We compared model simulated GPP to the median of a 30 member ensemble of FluxCom GPP estimates based on remote sensing and meteorological data. Additionally, we validated regionally simulated GPP and LE against site level observations. We validated simulated GPP by comparing against the three AmeriFlux sites used for model calibration and validation (Section 2.3.1) and validated LE by comparing against these three sites and three additional AmeriFlux sites (US-Bo1, US-Br1, and US-IB1) in the US Midwest with corn soybean rotation (Figure 1 and Table 1). Similar to the US-Ro1 site, the gap filled and partitioned data for these three additional sites was downloaded from the AmeriFlux website (downloaded in April 2021). At the US-Bo1, US-Br1, and US-IB1 sites, corn-soybean rotation occurred every year between 1997 and 2008, 2005 and 2011, and 2005 and 2018, respectively (Table 2). Since observations for the six AmeriFlux sites were available for different years, we validated regionally simulated GPP and LE by comparing observations across all years to transient simulations from 2001 to 2010.

3 Results

3.1 Site-Scale Calibration and Validation

The most influential parameters for both corn and soybean were similar across the three sites, with three parameters (leafcn, mbbopt, and planting_temp) being common across both crops and sites (Figures S2, S3, S4, S5 in Supporting Information S1 and Table 5). For both crops, the parameters controlling plant phenology (planting_temp and hybgdd) were among the most influential parameters across all four QoIs, except hybgdd for US-Ro1. Another phenological parameter that controls the maximum number of days required to reach maturity, mxmat, was among the five most influential parameters for a few sites, crops, and QoIs. Across all three sites, the parameter associated with stomatal conductance (mbbopt) was more sensitive for soybean than for corn. The parameters controlling leaf CN allocation (leafcn) and top of canopy specific leaf area (slatop) were both identified as influential parameters across crops and sites, with leafcn being generally more influential for carbon fluxes and slatop more influential for energy fluxes. The parameter controlling the base rate for maintenance respiration (br_mr) was identified as the most influential parameter for ER across sites and crops during the non-growing season and less influential during the growing season. Since maintenance respiration is negligible during the non-growing season, br_mr was selected among the most influential parameters for only a few sites and crops (Table 5).

Table 5. Most Sensitive Parameters and Their Optimum Value After Calibration

| Parameter | Input range, corn | Input range, soybean | US-Ne3 corn | US-Ne3 soybean | US-Ro1 corn | US-Ro1 soybean | US-UiC corn | US-UiC soybean |
|---|---|---|---|---|---|---|---|---|
| br_mr × 10−6 | 1.26–3.75 | 1.26–3.75 | | 3.74 | 3.54 | 3.14 | | |
| hybgdd | 1,275–2,125 | 1,425–2,375 | 1,383 | 1,428 | | 1,434 | 1,331 | 1,425 |
| leafcn | 8–25 | 8–25 | 18 | 22 | 25 | 22 | 25 | 25 |
| mbbopt | 4–12 | 4–12 | 4.9 | 6.0 | 12.0 | 4.5 | 4.8 | 6.7 |
| mxmat | 125–175 | 125–175 | | | | | | 146 |
| planting_temp | 287–293 | 287–293 | 290 | 291 | 291 | 290 | 291.7 | 289 |
| slatop | 0.03–0.08 | 0.02–0.07 | 0.03 | | 0.08 | | 0.03 | |

The optimized parameter values varied across sites and crops (Table 5), with parameters being more similar between the US-Ne3 and US-UiC sites than at the US-Ro1 site. The final parameter values were obtained by rounding the optimized parameter values, averaging the optimized values for the parameter that is not cft specific (br_mr), and using default values for parameters that were not optimized (Table 6). The US-Ro1 corn calibration resulted in optimized parameter values for mbbopt and slatop much higher than for other crops and sites. For this site, meteorological forcing data was available for only a few years, which may have contributed to the higher estimated parameter values.

Table 6. Parameters Used for Validation and Regional Runs

| Parameter | US-Ne3 corn | US-Ne3 soybean | US-Ro1 corn | US-Ro1 soybean | US-UiC corn | US-UiC soybean |
|---|---|---|---|---|---|---|
| br_mr × 10−6 | 3.1 (a) | 3.1 (a) | 3.3 (a) | 3.3 (a) | 2.52 (b) | 2.52 (b) |
| hybgdd | 1,400 | 1,400 | 1,700 (b) | 1,400 | 1,300 | 1,400 |
| leafcn | 18 | 22 | 25 | 22 | 25 | 25 |
| mbbopt | 5 | 6 | 12 | 4.5 | 5 | 7 |
| mxmat | 165 (b) | 150 (b) | 165 (b) | 150 (b) | 165 (b) | 150 |
| planting_temp | 290 | 291 | 291 | 290 | 291 | 289 |
| slatop | 0.03 | 0.07 (b) | 0.08 | 0.07 (b) | 0.03 | 0.07 (b) |

  • (a) The br_mr parameter is not cft specific; hence the average of the optimized values for corn and soybean was used.
  • (b) For parameters that were not optimized during calibration, the default values were used.

In general, the calibrated model captured the observed seasonality and magnitude of GPP, ER, and LE, with the fraction of the variance explained being higher for carbon fluxes than for energy fluxes. The calibrated GPP matched the observed seasonality and peak magnitude for both crops at the three sites, except for the timing of leaf senescence for both crops at US-Ro1 and the peak GPP for corn at US-UiC (subplots A and B in Figure 2, Figures S6 and S7 in Supporting Information S1). Overall, within the calibration window, the posterior GPP estimates explained 97% and 87% of the observed daily variance at the US-Ne3 site, 85% and 77% of the variance at the US-Ro1 site, and 93% and 89% of the variance at the US-UiC site for corn and soybean, respectively (Table S2 in Supporting Information S1). Across crops and sites, the calibrated ER matched the observed seasonality and the magnitude during the growing period; however, the simulated magnitude differed from observations for most crops and sites during the non-growth period and for US-Ne3 soybean during the growing period (subplots C and D in Figure 2, Figures S6 and S7 in Supporting Information S1). The posterior ER estimates explained more than 80% of the observed daily variance across crops and sites (Table S2 in Supporting Information S1). The posterior estimates of latent heat flux captured the observed seasonality and magnitude, although sensible heat flux was not well calibrated, especially for soybean (subplots E–H in Figure 2, Figures S6 and S7 in Supporting Information S1).

Figure 2. Model calibration: Observed versus prior and posterior distributions of the modeled GPP (gC m−2 day−1), ER (gC m−2 day−1), LE (W m−2), and H (W m−2) for corn and soybean at US-Ne3. The prior distribution (red shade) represents the daily simulated values for the 2000 ensemble members, while the posterior distribution (green shade) represents the calibrated values estimated with the optimized parameters. The black line represents the observed daily average across the calibration years (Table 2).

The model closely captured the seasonality and peak magnitude of various fluxes across most, but not all, crops and sites (Figure 3, Figures S8 and S9 in Supporting Information S1). Seasonality of carbon and energy fluxes was well reproduced across crops and sites, except for the sensible heat flux for soybean and the leaf senescence timing at US-Ro1. The peak flux magnitude for GPP, ER, and LE was well captured for corn at all three sites, except for ER at US-UiC, while for soybean the peak magnitude of these fluxes was underestimated. The markedly higher observed corn ER at US-UiC during the validation years, 51% and 57% higher than the 10-year average (Moore et al., 2022), resulted in the large difference between simulation and observations (Figure S9c in Supporting Information S1). The model was unable to reproduce the peak magnitude of sensible heat flux across crops and sites, except for corn at US-UiC.

Figure 3. Model validation: Observed versus simulated GPP (gC m−2 day−1), ER (gC m−2 day−1), LE (W m−2), and H (W m−2) for corn and soybean at US-Ne3 using optimized parameter values (Table 6). The red lines represent the daily average model simulation over the last 10 years of the transient run, and the black line represents the observed daily average values over the validation years (Table 2). The Relative Root Mean Square Error (RRMSE) is dimensionless and represents the root mean square error (RMSE) normalized by the root mean square of the observations.
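
The RRMSE definition given in the caption can be made concrete with a short sketch. The function name and the sample values are illustrative assumptions and are not taken from the study's code or data.

```python
import math

def rrmse(simulated, observed):
    """RMSE normalized by the root mean square of the observations (dimensionless)."""
    if len(simulated) != len(observed) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    # Root mean square error between simulation and observation
    rmse = math.sqrt(sum((s - o) ** 2 for s, o in zip(simulated, observed)) / n)
    # Root mean square of the observations, used as the normalizer
    rms_obs = math.sqrt(sum(o ** 2 for o in observed) / n)
    return rmse / rms_obs

# Hypothetical daily GPP values (gC m-2 day-1), for illustration only
obs = [2.0, 4.0, 6.0, 8.0]
sim = [1.0, 4.0, 7.0, 8.0]
print(f"RRMSE = {rrmse(sim, obs):.3f}")
```

Because the normalizer is the root mean square of the observations rather than their mean, the metric stays well defined for fluxes such as H that can change sign.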

The simulated values of LAI and yield, outputs not used for calibration, were lower than observations. The simulated peak LAI magnitude was lower than observations across crops and sites, with the difference between observed and simulated much smaller at the US-Ne3 site than at the other two sites (Figure 4, Figures S10 and S11 in Supporting Information S1). The simulated harvest captured the yearly harvest variability for the US-Ro1 site, was toward the lower end of the observed harvest for the US-UiC site, and was underestimated for the US-Ne3 site (Figure 5, Figures S12 and S13 in Supporting Information S1).

Figure 4. Model validation: Observed versus simulated leaf area index (LAI) for corn and soybean at US-Ne3 using optimized parameter values (Table 6). The black lines represent simulated LAI over the calibration and validation years. The blue (corn) and orange (soybean) circles represent observed weekly LAI.

Figure 5. Model validation: Observed versus simulated crop harvest for corn and soybean using optimized parameter values (Table 6). The orange bars represent simulated annual harvest over the calibration and validation years, and the gray bars represent observed harvest. The light yellow background represents corn years and the light blue background represents soybean years.

Simulations closely captured the distinctly different GPP patterns of corn and soybean after implementation of crop rotation in ELM. At the US-Ne3 site, observed annual GPP for the soybean years was approximately 60% of the annual GPP for corn, and this large difference between the two crops was accurately captured by ELM (Figure S14 in Supporting Information S1). At the US-Ne3 site, 100% of the crop is rotated, whereas crop rotation occurs in less than 20% of the gridcell fraction in the regional simulation (Figure S1 in Supporting Information S1). In the regional simulation, at the cft level, annual GPP varies by approximately 20% in grid cells with maximum corn-soybean rotation, while at the grid level the difference in annual GPP was negligible. This is because each grid cell comprises several landunits that in turn consist of various cfts. In addition, only a fraction of the corn cft undergoes crop rotation. Thus, when the impact of crop rotation is scaled up to the gridcell level, it becomes negligible.
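
The dilution of the rotation signal described above can be illustrated with an area-weighted average. This is a minimal sketch: the area fractions and GPP values are hypothetical, and the flat list of fractions is a simplification of ELM's gridcell, landunit, and cft hierarchy.

```python
def gridcell_mean(values, area_fractions):
    """Area-weighted mean of per-cft values for one grid cell."""
    if len(values) != len(area_fractions):
        raise ValueError("values and area_fractions must have equal length")
    if abs(sum(area_fractions) - 1.0) > 1e-9:
        raise ValueError("area fractions must sum to 1")
    return sum(v * w for v, w in zip(values, area_fractions))

# Hypothetical grid cell: 10% rotated cropland, 30% continuous corn,
# 60% other vegetation (all numbers are illustrative, not from the study).
fractions = [0.10, 0.30, 0.60]
gpp_corn_year = [1500.0, 1500.0, 1000.0]  # gC m-2 yr-1, rotated cft grows corn
gpp_soy_year = [1200.0, 1500.0, 1000.0]   # rotated cft grows soybean (20% lower)

corn_year = gridcell_mean(gpp_corn_year, fractions)
soy_year = gridcell_mean(gpp_soy_year, fractions)
cft_change = (gpp_corn_year[0] - gpp_soy_year[0]) / gpp_corn_year[0] * 100
grid_change = (corn_year - soy_year) / corn_year * 100
print(f"cft-level change: {cft_change:.1f}%, gridcell-level change: {grid_change:.1f}%")
```

With these assumed numbers, a 20% difference at the cft level shrinks to about 2.5% at the gridcell level, because the rotated cft covers only a small share of the cell.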

3.2 Regional Simulation

We found that varying only a few parameters across the region had a large impact on carbon and energy fluxes. Annual GPP and ER varied by up to 40% and 35%, respectively (Figure 6 and Figure S16 in Supporting Information S1); the difference in fluxes was driven by both corn and soybean (Figure S15 in Supporting Information S1). Using non-regional parameters produced large changes in fluxes, both positive and negative. For example, grid cells in the Ohio Valley had both higher and lower fluxes in various regions when optimized crop parameters from the two other regions were utilized (subplots b, c, e, and f in Figure 6 and Figure S16 in Supporting Information S1).

Analogous to carbon fluxes, energy fluxes also varied across the regions due to differences in crop parameters. For the summer months from June to September, different crop parameters resulted in LE varying by up to 15% (Figure S17 in Supporting Information S1) and H varying by up to 40% (Figure S18 in Supporting Information S1). Additionally, the impact of different crop parameters varied across months. For example, grid cells located in the Upper Midwest showed a lower LE flux in July and a higher LE flux in August when optimized crop parameters from the Northern Rockies were used for their simulation (subplots f and j in Figure S17 in Supporting Information S1).

Optimized and spatially varying crop parameters reduced the difference between observed and simulated annual GPP compared to the default uncalibrated parameters. At a regional scale, the absolute average difference between simulated and FluxCom-based (Jung et al., 2020) annual GPP from 2001 to 2010 decreased from 46% to 25% when default crop parameters were replaced with calibrated and spatially varying crop parameters (Figure 7). Site-level comparison across the three calibration sites yielded similar results, with simulated and observed annual GPP for corn (soybean) differing by an average of 43% (71%) and 1% (8%) for the default and calibrated crop parameters, respectively (Figure 8). For the site-level comparison, grid cells containing the observational sites were selected from the regional simulation. Similar to annual GPP, the relative root mean square error (RRMSE) between simulated and observed LE was lower when calibrated crop parameters were used than when default parameters were used for 7 out of 12 crop-sites (Figure S20 in Supporting Information S1). Site-level comparison of monthly fluxes revealed that the simulated growing season shifted by approximately a month compared to the observations; however, the peak simulated GPP was closer to the observations when calibrated crop parameters were used (Figure S19 in Supporting Information S1). The seasonal shift is likely due to the use of GSWP3 meteorological forcing instead of the site-specific forcing used for calibration. The shift in the simulated growing season also occurred in monthly LE for the three calibration sites and three additional sites in the US-Midwest where LE is routinely measured (Figure S20 in Supporting Information S1).
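
One way to compute a regional "absolute average difference" in percent is sketched below. The text does not spell out the exact formula, so this assumes an unweighted mean of per-grid-cell absolute percent differences relative to the reference product; the function name and sample values are hypothetical.

```python
def mean_abs_percent_diff(simulated, reference):
    """Mean over grid cells of |simulated - reference| / reference, in percent.

    Assumption: cells are weighted equally and cells with a non-positive
    reference value are skipped.
    """
    diffs = [
        abs(s - r) / r * 100.0
        for s, r in zip(simulated, reference)
        if r > 0
    ]
    if not diffs:
        raise ValueError("no grid cells with a positive reference value")
    return sum(diffs) / len(diffs)

# Hypothetical annual GPP (gC m-2 yr-1) for three grid cells
reference_gpp = [1000.0, 800.0, 1200.0]
simulated_gpp = [1250.0, 1000.0, 1200.0]
print(f"{mean_abs_percent_diff(simulated_gpp, reference_gpp):.1f}%")
```

An area-weighted mean would be a reasonable alternative when grid cells differ substantially in cropland area.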

Figure 6. Impact of constant versus varying parameters on annual GPP: Total annual gross primary productivity (GPP) estimated by using regionally varying parameters (a), difference in GPP (b–d), and percent difference in GPP (e–g) when using regional versus constant parameters for corn and soybean. Set1 is based on parameters obtained from calibrating US-Ne3, Set2 is based on US-Ro1 calibration, and Set3 is based on US-UiC calibration. The composite set utilized parameters based on US-Ne3 calibration for the Northern Rockies, on US-Ro1 calibration for the Upper Midwest, and on US-UiC calibration for the Ohio Valley (Figure 1). Annual simulated GPP is based on the average of 10 years of transient runs from 2001 to 2010.

Figure 7. Comparison of simulated annual GPP to FluxCom estimates: Annual GPP estimates based on FluxCom (Jung et al., 2020) (a), ELM default crop parameters (b), composite set with calibrated and spatially varying parameters (c), percent difference between FluxCom and default set (d), and percent difference between FluxCom and composite set (e). FluxCom GPP is based on the average over 2001–2010, and simulated GPP is based on the average of 10 years of transient runs from 2001 to 2010.

Figure 8. Comparison of simulated and observed annual GPP at AmeriFlux calibration/validation sites: The simulated annual GPP was obtained from the regional run by identifying grid cells closest to the observation sites. Observed annual GPP is based on observations for both calibration and validation years, and simulated GPP is based on the average of 10 years of transient runs from 2001 to 2010. The default simulation utilized ELM default crop parameters, while the composite simulation utilized parameters based on US-Ne3 calibration for the Northern Rockies, on US-Ro1 calibration for the Upper Midwest, and on US-UiC calibration for the Ohio Valley (Figure 1).

Comparison of observed GPP with simulated annual GPP using calibrated but spatially invariant crop parameters (Set1, Set2, and Set3) reveals similar spatial patterns among the three regions. All three sets of regional simulations overestimated GPP for most of the study region, except for grid cells in Kentucky and Missouri (Figure S21 in Supporting Information S1). The absolute average difference between simulated and FluxCom-based annual GPP from 2001 to 2010 was 29%, 22%, and 27% for Set1, Set2, and Set3, respectively. The absolute average difference for Set2 (22%) was slightly lower than for the composite set (25%). This is because Set2 simulates lower annual GPP for the entire US-Midwest region compared to the other sets (Figures 6b–6d), bringing it closer to the FluxCom estimates of annual GPP (Figure S22 in Supporting Information S1), which are lower than the simulated estimates for most of the agriculturally intensive US-Midwest (Figure 7).

4 Discussion

We found that spatially varying a small number of parameters had a large impact on carbon and energy fluxes; in particular, parameter optimization, and the use of spatially varying parameters, generally reduced the bias between simulated and observed fluxes. These results have implications for optimal model parameterization, the importance of considering spatial variability in parameters as well as implementing crop rotation, and pathways to addressing known existing model limitations in the future.

4.1 Model Parameterization: Optimization and Spatial Variability

The optimized parameter values estimated in this study differ across crop-sites but are within previously observed or modeled ranges. For instance, global observational estimates of corn slatop have varied between 0.015 and 0.035 (m2g−1) (Amanullah et al., 2007; Mohammadi, 2007; H. Zhou et al., 2020). In this study, the calibrated values of corn slatop range between 0.03 and 0.08 (m2gC−1), which is equivalent to 0.014–0.034 (m2g−1) (assuming the leaf carbon content is 45% of leaf weight) and falls within the observed range. Similarly, our calibrated value of hybgdd for soybean (1400), although significantly lower than the default of 1900, is similar to the values used for soybean in the US Midwest by Bilionis et al. (2015). Finally, except at US-Ro1, the optimized value of mbbopt for corn is slightly higher than 4, which is consistent with the ELM default of 4. However, for soybean the optimized values of mbbopt are less than 7 (Table 5), which is lower than the ELM default of 9 for C3 plants. Similar to our findings, Duarte et al. (2017) found that lowering mbbopt from the C3 default of 9 to 6 better captured GPP and LE in coniferous forest in the northwestern US. In summary, the optimized parameter values identified here for slatop, hybgdd, and mbbopt improved flux estimation at the three calibration sites and are also similar in magnitude to values reported in prior observational or modeling studies.
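
The unit conversion used for slatop in this paragraph can be written out as a small sketch. The function name is hypothetical; the 45% carbon fraction is the assumption stated in the text.

```python
def sla_per_dry_mass(sla_per_gc, carbon_fraction=0.45):
    """Convert specific leaf area from m2 per gC to m2 per g of leaf dry mass.

    (m2 / gC) * (gC / g dry mass) = m2 / g dry mass
    """
    if not 0.0 < carbon_fraction <= 1.0:
        raise ValueError("carbon_fraction must be in (0, 1]")
    return sla_per_gc * carbon_fraction

# Rounded corn slatop values from the optimized-parameter table (m2 gC-1)
for slatop in (0.03, 0.08):
    print(f"{slatop} m2/gC -> {sla_per_dry_mass(slatop):.4f} m2/g")
```

With the rounded table values 0.03 and 0.08, this gives roughly 0.0135 and 0.036 m2 g−1; the range quoted in the text presumably comes from unrounded calibrated values.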

One of the primary lessons from our analysis is that using constant crop parameters instead of spatially varying parameters can result in under- or overestimation of regional fluxes, to the extent that it would be impossible to accurately (i.e., without significant spatial biases) capture the impact of crops on local and regional climate via biogeochemical and biophysical impacts on the land surface. Importantly, the impact of constant versus spatially varying parameters differs spatially and temporally (Figure S17 in Supporting Information S1) and cannot be estimated by simple scaling of the effects, as is true for a range of other systematic (as opposed to random) errors in ecosystem- to global-scale observations (Richardson et al., 2006) and models (T. Zhou et al., 2009). These systematic model errors are often the reason why ecophysiological and biogeochemical models have more difficulty reproducing spatial variability than overall means, at scales ranging from regional biomass and carbon fluxes (Bond-Lamberty et al., 2007; Castanho et al., 2013) to global soil carbon pools (Todd-Brown et al., 2013). Here we use high-quality observational data from multiple AmeriFlux sites for managed croplands to better capture the observed spatial variability in fluxes and to quantify the parametric uncertainty introduced by using spatially invariant parameters.

We found that parameter Set2 (based on data from US-Ro1) simulated annual GPP closest to FluxCom for the entire US-Midwest region (Figure S21 in Supporting Information S1), but this finding has two important caveats. First, satellite-based estimates of cropland GPP, like FluxCom, have large uncertainty (Yuan et al., 2015), and therefore the lowest absolute average difference between FluxCom and a particular set does not mean that those optimized parameters are the best for the entire US-Midwest (Figure S22 in Supporting Information S1). Second, our results imply that the AmeriFlux site chosen for each of the three regions is not representative of its entire region, because fluxes estimated using optimized parameters from an AmeriFlux site in another region are sometimes closer to observed fluxes; data from additional sites are thus needed to capture the spatial variability across the Midwest region.

4.2 Importance of Climate and Crop Rotation

Non-site-specific climatic forcing increased the difference between observed and simulated fluxes. Annual GPP observed at the three calibration sites was slightly different from simulated annual GPP in the regional run (Figure 8). This difference, despite calibration to the same data (Section 3.1), can be attributed to (a) GSWP3 forcing being used for the regional simulation compared to the site-specific forcing utilized for calibration, and (b) GPP for all available years being used for estimating average annual GPP as opposed to only the calibration years. These findings are generally consistent with previous work documenting the strong impact of climatic forcing data, at a regional scale, on carbon fluxes from forested regions (Dorheim et al., 2022) and on above ground biomass estimation for mountainous regions (Duarte et al., 2022). At a global scale, climate forcing can contribute more than half of the total uncertainty in carbon cycle fluxes (Bonan et al., 2019) and can be the dominant driver of variability for the net ecosystem flux (Hardouin et al., 2022). The large contribution of climate forcing to total uncertainty also implies that the difference between observed and simulated fluxes can be further reduced by using a forcing data set that is better suited to the region.

The implementation of a realistic crop rotation capability in ELM allowed us to accurately capture the difference in peak flux magnitude between corn and soybean. Similar to our findings, Boas et al. (2021) found that realistic crop rotation improved LAI and latent heat flux estimation for field sites in Europe. Our findings suggest that simulating crop rotation, which is widely practiced in the Continental United States as well as globally (Sahajpal et al., 2014; Wallander, 2013), is important for accurately capturing feedbacks between human and agricultural ecosystems. Crop rotation representation will be even more critical in future integrated Earth System Models with enhanced human-climate feedback capabilities (Calvin & Bond-Lamberty, 2018; Thornton et al., 2017).

Although our analysis found that the impact of crop rotation on annual GPP is negligible when scaled up to the gridcell, we argue that it is still important to represent crop rotation in ESMs because it affects biogeochemical cycles apart from the carbon fluxes; for example, it can result in markedly different yields from year to year (Figure 5). Similarly, crop rotation reduces fertilizer requirements and nitrogen loss from agricultural systems.

4.3 Model and Study Limitations

Both the ELM-Crop model and our study design have limitations that are important to note. In ELM, LAI is estimated as the product of the specific leaf area parameter (slatop) and leaf carbon content. The underestimation of corn LAI for US-Ne3 and US-UiC in this study is likely caused by the lower optimized values of corn slatop for these two sites (Table 6). Interestingly, prior studies using CLM have reported a positive LAI bias, which may be due to the higher slatop used in these studies compared to our study. For example, Peng et al. (2018) reported overestimation of maize LAI by CLM4.5, which had slatop set at 0.05, while Chen et al. (2018) reported overestimation of corn and soybean LAI using CLM4 with a slatop value of 0.07. In our study, a higher calibrated value of slatop for corn at US-Ro1 also resulted in higher LAI; however, for this crop-site LAI observations are not available for comparison (Figure S10a in Supporting Information S1). We have more data on slatop than on almost any other trait (Kattge et al., 2020), but models are highly sensitive to it (Shiklomanov et al., 2020), and thus even small data or calibration problems in this area cascade throughout the model. Analogous to LAI, simulated yield was also lower than the observed yield. Similar to our results, lower corn yield was simulated using CLM4.5 (Peng et al., 2018) and CLM5.0 (D. L. Lombardozzi et al., 2020). Because of the close links between slatop, LAI, and photosynthesis, the start of the terrestrial chain of carbon processing, it is unsurprising that the underestimation of yield in this study and in CLM is likely caused by the underestimation of above ground biomass (Peng et al., 2018).
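
The sensitivity of LAI to slatop follows directly from the product form described at the start of this paragraph. The sketch below uses that simplified relation only; the function name and the leaf carbon value are hypothetical, and the actual model code may include additional terms.

```python
def lai_from_leaf_carbon(slatop, leaf_carbon):
    """LAI (m2 leaf per m2 ground) as slatop (m2 leaf per gC) times leaf carbon (gC per m2 ground)."""
    if slatop < 0 or leaf_carbon < 0:
        raise ValueError("slatop and leaf_carbon must be non-negative")
    return slatop * leaf_carbon

# Hypothetical leaf carbon pool (gC m-2), identical for both cases
leaf_c = 100.0
lai_low = lai_from_leaf_carbon(0.03, leaf_c)   # corn slatop at US-Ne3 and US-UiC
lai_high = lai_from_leaf_carbon(0.08, leaf_c)  # corn slatop at US-Ro1
print(f"LAI with slatop 0.03: {lai_low:.1f}; with slatop 0.08: {lai_high:.1f}")
```

For the same leaf carbon, LAI scales linearly with slatop, so the lower calibrated corn slatop translates directly into a lower simulated LAI.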

Limitations of the current study include performing site-scale calibration and validation using a limited number of QoIs. Future studies can enhance model performance by comparing against observations of above and below ground biomass; this is consistent with the argument of Keenan et al. (2012), who advocated for simultaneous calibration of models against diverse data streams. Another limitation of the current study is that for the regional simulations we did not account for how the agricultural management practices of tillage, cover crops, crop residue management, and disease control can affect carbon and energy fluxes (Deryng et al., 2011; Dick et al., 1998). Estimating the impacts of these agricultural management practices on land fluxes and yield is beyond the scope of the current paper but is worth exploring in future studies.

Finally, and perhaps most fundamentally, ELM and most other global land models have plant and soil parameters that do not vary in space, time, or with forcing conditions such as light availability (Dohleman et al., 2009; Tian et al., 2015; Trócsányi et al., 2009; Van Esbroeck et al., 2003). The use of constant parameters prohibits accurate estimation of various fluxes (T. Zhou et al., 2009) and crop yields (Osborne et al., 2015); it is a major limitation of Earth System Models and a primary motivation for our analysis. This limitation can be addressed by modifying the model to read spatially variant crop parameters and by generating robust maps of parameters in space and time with well-defined errors. We also need additional studies, similar to ours, that explore the magnitude of potential biases in agricultural ecosystems caused by spatially invariant parameterizations in Earth System Models.
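
The idea of reading spatially variant crop parameters can be sketched as a lookup from region to calibration site, mirroring the composite set used in this study. The data structures and function name are hypothetical and do not reflect ELM's parameter file format; the values are the corn slatop and mbbopt entries from the optimized-parameter table.

```python
# Corn parameters from the optimized-parameter table (subset, for illustration)
SITE_CORN_PARAMS = {
    "US-Ne3": {"slatop": 0.03, "mbbopt": 5},
    "US-Ro1": {"slatop": 0.08, "mbbopt": 12},
    "US-UiC": {"slatop": 0.03, "mbbopt": 5},
}

# Region-to-site assignment used for the composite set in this study
REGION_TO_SITE = {
    "Northern Rockies": "US-Ne3",
    "Upper Midwest": "US-Ro1",
    "Ohio Valley": "US-UiC",
}

def corn_params_for_region(region):
    """Return a copy of the corn parameters assigned to a region (hypothetical helper)."""
    try:
        site = REGION_TO_SITE[region]
    except KeyError:
        raise ValueError(f"no calibration site assigned to region {region!r}")
    return dict(SITE_CORN_PARAMS[site])

for region in REGION_TO_SITE:
    print(region, corn_params_for_region(region))
```

A gridded parameter map would generalize this lookup by replacing the three-region table with per-grid-cell values and associated uncertainties.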

5 Conclusions

In this study, we implemented the realistic agricultural management practice of crop rotation; calibrated and validated corn-soybean rotation using multiple observations from the US-Midwest; and examined the impact of different parameterization schemes on carbon and energy fluxes.

We found that representing the agricultural management practice of crop rotation is important for studying the feedback between crops and climate and for quantifying the impact of agriculture on energy fluxes and biogeochemical cycling. Our study shows that implementing crop rotation and carefully calibrating crop parameters improved estimation of site-level fluxes. We also found that the use of spatially variant crop parameters can have a large impact on carbon and energy fluxes. Such rigorous, spatially detailed approaches to crop modeling hold the potential to greatly improve flux estimation from agricultural regions.

Correctly representing the feedbacks between crops and climate is especially important for next-generation ESMs that focus on improving human-earth system interactions. Additionally, future studies that calibrate corn and soybean for different regions, or that calibrate similar crops, can optimize only the most sensitive parameters identified in this study. The reduced parameter set can greatly lower the dimensionality of the surrogate models and improve their accuracy. It remains challenging to calibrate ESMs to multiple sites due to limited observational data availability, and our results emphasize the importance of observational networks such as FLUXNET (Baldocchi et al., 2001) and NEON (https://www.neonscience.org/) for ESMs.

Acknowledgments

This research was supported as part of the Energy Exascale Earth System Model (E3SM) project, funded by the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research. The Pacific Northwest National Laboratory is operated by Battelle for the US Department of Energy under Contract DE-AC05-76RLO1830. Dr. Katherine Calvin is currently detailed to the National Aeronautics and Space Administration. Dr. Calvin's contributions to this article occurred prior to her detail. The views expressed are her own and do not necessarily represent the views of the National Aeronautics and Space Administration or the United States Government. Funding for the AmeriFlux data portal was provided by the U.S. Department of Energy Office of Science. The field study conducted at the US-UiC site was funded by the DOE Center for Advanced Bioenergy and Bioproducts Innovation (U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research under Award Number DE-SC0018420), the Energy Biosciences Institute at the University of Illinois Urbana-Champaign, and the Global Change and Photosynthesis Research Unit of the United States Department of Agriculture/Agricultural Research Service. We also thank two anonymous reviewers for their thoughtful comments that helped to significantly improve the manuscript.

Data Availability Statement

Data from the AmeriFlux network US-Ne3 (Suyker, 2022), US-Ro1 (Baker & Griffis, 2018), US-UiC (Bernacchi, 2022), US-Bo1 (Meyers, 2016), US-Br1 (Prueger & Parkin, 2016), and US-IB1 (Matamala, 2019) were used in the creation of this manuscript. The E3SM model is described in detail at https://e3sm.org/. The source code for ELMv2 is archived and made publicly available at https://github.com/E3SM-Project/E3SM/releases/tag/v2.0.0. All of the code supporting this paper is available at https://github.com/evasinha/Sinha-etal-2022-JGR-Bio and data supporting the paper is available at http://doi.org/10.5281/zenodo.7555458.