Joint Modeling of Crop and Irrigation in the central United States Using the Noah‐MP Land Surface Model

Representing climate‐crop interactions is critical to Earth system modeling. Despite recent progress in modeling dynamic crop growth and irrigation in land surface models (LSMs), transitioning these models from field to regional scales is still challenging. This study applies the Noah‐MP LSM with dynamic crop‐growth and irrigation schemes to jointly simulate the crop yield and irrigation amount for corn and soybean in the central United States. Model performance for crop yield and irrigation amount is evaluated at the county level against USDA reports and USGS water withdrawal data, respectively. The bulk simulation (with uniform planting/harvesting management and no irrigation) produces significant biases in crop yield estimates for all planting regions, with root‐mean‐square errors (RMSEs) of 28.1% and 28.4% for corn and soybean, respectively. Without an irrigation scheme, the crop yields in the irrigated regions are reduced due to water stress, with RMSEs of 48.7% and 20.5%. Applying a dynamic irrigation scheme effectively improves crop yields in irrigated regions and reduces RMSEs to 22.3% and 16.8%. In rainfed regions, the model overestimates crop yields. Applying spatially varied planting and harvesting dates at the state level reduces crop yields and irrigation amount for both crops, especially in northern states. A “nitrogen‐stressed” simulation shows that the improvement in crop yields from irrigation is limited when the crops are under nitrogen stress. Several uncertainties in modeling crop growth are identified, including yield gap, planting date, rubisco capacity, and discrepancies between available data sets, pointing to future efforts to incorporate spatially varying crop parameters to better constrain crop growing seasons.


Introduction
This study intends to extend the investigation of Xu et al. (2019), which focused on the transition of dynamic irrigation modeling from field to regional scales, by assessing the benefits and uncertainties in joint crop-growth and irrigation modeling in the context of capturing climate-crop-irrigation interactions in Earth system models (ESMs). It has been recognized that climate change and variability play a major role in affecting crop production (Leng et al., 2016; Ray et al., 2015) from regional to global scales (Leng et al., 2016). Climate change has already impacted global agricultural production (Ray et al., 2019), and negative trends in crop yield per degree of warming have been projected for major crops. Agricultural management modifies surface water and energy balances, alters characteristics of land-atmosphere interactions, and hence impacts local and regional climate (Pielke et al., 2007). Furthermore, irrigation practices have been shown to increase humidity and decrease air temperature (Chen et al., 2018; Xu et al., 2019). This irrigation-cooling effect has been shown to modify the local environment and regional precipitation, and even to reduce the chance of extreme heatwaves in the United States (Lu & Kueppers, 2015) and globally (Thiery et al., 2017).
To better understand the nexus of climate change, crop yield, and freshwater, as well as critical cropland-atmosphere interactions, it is necessary to improve the representation of dynamic crop growth and irrigation in ESMs. Recent efforts have been dedicated to implementing crop growth dynamics and agricultural management in land surface models (LSMs) within ESMs (Leng et al., 2016; Levis et al., 2012; Liu et al., 2016; McDermid et al., 2017). For instance, crop growth models were introduced into the Community Land Model Version 4 with carbon-nitrogen cycle (CLM4CN) by Levis et al. (2012), which focused on the crop coverage in mid-latitude regions. The results showed improvement in simulating leaf area index (LAI), an index for crop growth, and summer precipitation, compared to the default setting of CLM4.5. This work also highlights the importance of accurate representation of the cropping calendar, as a "late-planting" sensitivity test improved the simulated annual cycle of net ecosystem exchange (NEE) in midwestern North America. More recently, a dynamic crop growth model was incorporated into the Noah with multiple-physics (Noah-MP, Niu et al., 2011) model and tested at two field sites in Illinois and Nebraska for corn and soybean (Liu et al., 2016). In Noah-MP-Crop, crop growth stages depend solely on growing degree days (GDD). The Noah-MP-Crop model improved the simulation of surface energy balance and LAI and provided reasonable estimates of biomass. While these works demonstrated widespread potential for agriculture-climate interactions in some key agroecological regions, it is still challenging to accurately represent crop-climate-hydrology interactions in general, and specifically the spatial variations of crop-model parameters across various scales.
Similarly, irrigation parameterizations have been incorporated into various LSMs using the "soil moisture deficit" approach. For example, Ozdogan et al. (2010) used the soil field capacity as a threshold, below which irrigation is triggered, and calculated the irrigation demand by subtracting the current root-zone soil moisture from field capacity. Lawston et al. (2015) applied this soil moisture deficit approach in the coupled Weather Research and Forecasting (WRF) model and found that the regional climate is highly sensitive to the irrigation method chosen (drip, flood, or sprinkler). Xu et al. (2019) used a similar approach to mimic sprinkler irrigation at the county level in the central United States. Instead of using a uniform value of field capacity, they determined a spatially varying soil moisture threshold parameter through regional calibration against the USGS water withdrawal data, which enables transforming model parameters from field to regional scale.
The above-mentioned crop-focused and irrigation-focused modeling approaches are inadequate to comprehensively address climate-crop-water interactions. Crop-focused models neglect irrigation water, which is a significant input to the surface water budget in semiarid croplands. This omission results in a warm/dry surface environment through land-atmosphere interactions, as well as a loss in crop yield due to water stress. On the other hand, irrigation-only models fail to capture the feedback between irrigation water demand and crop growth stages. Regional irrigation modeling will therefore benefit from a dynamic representation of crop heterogeneity, such as constraining the simulated irrigation amount by crop planting/harvest dates. Thus, it is necessary to perform joint crop-irrigation modeling in LSMs. Leng et al. (2016) provided the first joint crop and irrigation modeling effort at large scale in the United States, and optimized irrigation and fertilization practice in CLM4.5CN. The results showed that without optimization, the corn yield is substantially underestimated, due to the quick denitrification in CLM4.5CN previously reported by Oleson et al. (2013). The irrigation optimization increases yield only in the irrigated region, and the fertilization optimization showed significant improvement in all regions. However, the improvement in crop yield from the irrigation scheme under sufficient nutrient conditions is not discussed. Moreover, uncertainties associated with crop model parameters, sparse agricultural data sets at both spatial and temporal scales, and even discrepancies between available data sets remain unresolved.
Given the wide use of Noah-MP LSM in the community WRF model and in the operational National Water Model (NWM), it is important to understand and improve its capability in simulating concurrently crop growth and irrigation, because both processes affect surface heat and water-vapor fluxes (as lower boundary conditions in WRF) and streamflow. Therefore, the primary objectives of this study are to (1) assess the Noah-MP model's performance in joint crop and irrigation modeling, (2) investigate methods of transforming irrigation and crop modeling from field to regional scales, and (3) identify uncertainties and challenges in crop modeling in LSMs. We focus on two crops (corn and soybean) in this study, since they are the two crops currently represented in Noah-MP-Crop and are two major field crops in the central United States. Section 2 introduces the data required for model input and evaluation, and the Noah-MP crop and irrigation schemes. The model results for crop yield and irrigation amount are presented in section 3. The uncertainties in simulating crop yield are discussed in section 4. We conclude our findings in section 5.

Data Preparation
In this work, several agricultural management data sets are used to help constrain the crop and irrigation models and to define the crop growing season, cultivated land fraction, and irrigated fractions. The planted areas for corn and soybean are obtained from the 30-m CropScape data from the U.S. Department of Agriculture's (USDA) National Agricultural Statistics Service (NASS)/George Mason University (GMU) (https://nassgeodata.gmu.edu/CropScape/). This is a geo-referenced, crop-specific land cover data layer created for the contiguous United States using satellite imagery and supported by extensive agricultural ground truthing. The CropScape data set is originally derived from the planting frequency over 11 years (from 2008 to 2018) and is used to calculate the fractional coverage of total cropland (relative to the grid cell's vegetated area; hereafter F_crop) and of each crop type (relative to the grid cell's total cropland area; F_corn and F_soybean). In this study, the planting areas are determined by two criteria: (1) F_crop > 0.5; and (2) F_corn > 0.3 or F_soybean > 0.3, for corn and soybean, respectively. The planting areas for these two crops and their planting fractions are shown in Figure 1.
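The two selection criteria above can be illustrated with a minimal sketch. The function name and the example grid-cell values are hypothetical and only serve to show how the thresholds of 0.5 and 0.3 from the text would be applied; this is not the preprocessing code used in the study.

```python
def is_planting_cell(f_crop, f_type, crop_min=0.5, type_min=0.3):
    """Return True when a grid cell counts as a planting area for one crop.

    f_crop: cropland fraction relative to the cell's vegetated area (F_crop)
    f_type: crop-type fraction relative to total cropland (F_corn or F_soybean)
    The default thresholds are the values quoted in the text.
    """
    return f_crop > crop_min and f_type > type_min


# Hypothetical grid cells: (F_crop, F_corn, F_soybean)
cells = [(0.8, 0.45, 0.35), (0.6, 0.2, 0.5), (0.4, 0.6, 0.3)]

corn_mask = [is_planting_cell(fc, fcorn) for fc, fcorn, _ in cells]
soy_mask = [is_planting_cell(fc, fsoy) for fc, _, fsoy in cells]

print(corn_mask)  # [True, False, False]
print(soy_mask)   # [True, True, False]
```

A cell can qualify for both crops at once, which is consistent with the criteria being applied to each crop separately.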
The 2010 USDA report on usual planting and harvesting dates is used to define the length of the growing season for corn and soybean. This survey reports the most active period of usual planting and harvesting dates for each state. In our study, the middle dates of the planting and harvest windows are selected for the states within our study domain (see Figure 2). Although the middle dates for each crop in each state may not reflect the complex decisions behind actual planting and harvesting, they represent to some degree the spatial variation of planting and harvesting at the state level. The impacts of uncertainties in planting/harvesting dates on simulated crop yield and irrigation amount are discussed in section 3.2. For details of the planting and harvesting dates in each state, please see Appendix A.
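As a small illustration of taking the middle date of a reported window, the sketch below converts the midpoint to a day of year. The helper name, the rounding-down convention, and the example dates are assumptions made for illustration; the actual state-level dates are those given in Appendix A.

```python
from datetime import date, timedelta


def window_midpoint_doy(start, end):
    """Day of year at the midpoint of a [start, end] window, rounded down."""
    half_span = (end - start).days // 2
    return (start + timedelta(days=half_span)).timetuple().tm_yday


# Hypothetical "most active" planting window for one state
print(window_midpoint_doy(date(2010, 4, 20), date(2010, 5, 15)))  # 122
```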
For each year, the USDA NASS reports the average yields for various crops at the county level over the United States (https://quickstats.nass.usda.gov/). These data are based on harvested yields, reported by a sample of farmers within each county, and verified with independent yield samples taken by USDA staff when the crop reaches maturity (FAO and DWFI, 2015). Therefore, the model-simulated biomass (g/m²) must be converted to standard yield (bushels per acre, bu/ac) for comparison with the USDA county-level data, following the instruction (see http://www.ag.ndsu.edu/pubs/plantsci/crops/ae905w.htm). In Equations 1 and 2, 0.155 and 0.13 are the standard moisture contents (15.5% and 13%) for corn and soybean, respectively. Harvested corn usually contains an initial moisture content greater than 15.5% (15.5% to 32%). For transportation and storage purposes, mechanical drying is typically applied to reduce the initial moisture to the standard moisture. Two sources of weight loss are associated with this process: (1) the weight of the moisture lost (also known as "water shrink") and (2) the weight loss due to handling processes (Hicks & Cloud, 1992). The handling loss can range from 0.04% to 5.22%, depending on the initial moisture content and shrinkage loss. Therefore, the calculated dry mass losses tend to vary among growers. This uncertainty is worth noting when comparing the model-simulated dry mass with the standard yield in the USDA survey.
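A minimal sketch of this unit conversion is given below. The standard moisture contents (15.5% and 13%) come from the text. The bushel weights (56 lb for corn and 60 lb for soybean at standard moisture) are assumed here as the commonly used U.S. standards, and the function assumes that the model output is grain dry mass; the exact form of Equations 1 and 2 may differ.

```python
G_PER_LB = 453.59237        # grams per pound (exact definition)
M2_PER_ACRE = 4046.8564224  # square meters per acre (exact definition)

# Moisture contents are from the text; bushel weights are ASSUMED standards.
STANDARDS = {
    "corn": {"moisture": 0.155, "lb_per_bu": 56.0},
    "soybean": {"moisture": 0.13, "lb_per_bu": 60.0},
}


def dry_mass_to_yield(dry_grain_g_m2, crop):
    """Convert grain dry mass (g/m^2) to yield at standard moisture (bu/ac)."""
    std = STANDARDS[crop]
    # Mass of the same grain when it holds the standard moisture content
    wet_g_m2 = dry_grain_g_m2 / (1.0 - std["moisture"])
    # g/m^2 -> g/acre -> bushels/acre
    return wet_g_m2 * M2_PER_ACRE / (std["lb_per_bu"] * G_PER_LB)


# Hypothetical simulated grain dry masses
print(round(dry_mass_to_yield(1000.0, "corn"), 1))
print(round(dry_mass_to_yield(300.0, "soybean"), 1))
```

Because the conversion is linear, the handling losses discussed above would enter as an additional multiplicative factor, which this sketch leaves out.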
The irrigation locations are defined by the 500-m MODIS-based irrigation fraction map (Ozdogan & Gutman, 2008) and the critical irrigation threshold parameter, IRR_CRI, from Xu et al. (2019) is applied in this study (see Figure 3). IRR_CRI is a threshold parameter for the soil water content, below which the irrigation scheme will be activated and was calibrated at county-level in Xu et al. (2019). To evaluate the model irrigation amount, the 5-year report from the U.S. Geological Survey (USGS) on fresh water withdrawals for irrigation (http://water.usgs.gov/watuse/) is used to constrain and calibrate the irrigation parameters in the irrigation module (for details of irrigation modeling, see section 2.3 and Xu et al., 2019).
Two AmeriFlux sites with irrigated agriculture (Ne1 and Ne2 in Mead, NE; https://ameriflux.lbl.gov/sites/) are analyzed (Suyker, 2001a, 2001b). Ne1 is an irrigated continuous maize site, and Ne2 is an irrigated maize-soybean rotation site. Data collected at the AmeriFlux sites, including LAI, leaf mass per area (LMA), and harvested biomass, are used to evaluate the model output at these two locations with and without the irrigation scheme. Also, the measured leaf biomass per area (LMA; g/m²) is equivalent to the Noah-MP-Crop parameter that converts biomass to LAI (BIO2LAI), which is assumed to be a constant.

Noah-MP-Crop Model
Noah-MP is a land component of the Weather Research and Forecasting (WRF) model (Skamarock et al., 2008; Yang et al., 2011), which has been widely applied in numerical weather prediction (NWP) and in regional climate and hydrology studies. Noah-MP has also been used to simulate the land surface processes for streamflow forecasts in the National Water Model (www.water.noaa.gov/about/nwm).
The Noah-MP-Crop module consists of three components: a photosynthesis (PSN)-stomata scheme, a carbon allocation scheme, and a dynamic crop growth scheme. The leaf-level PSN rate and stomatal conductance are calculated based on the models of Farquhar et al. (1980) and Collatz et al. (1992) for C3 and C4 plants, respectively. However, there is only one set of PSN parameters, for a generic C3 crop, in the default Noah-MP. This simplified treatment does not represent corn (C4), a major productive species in the central United States. Therefore, in this study, a set of C4 PSN parameters is adapted from a synthesis of the literature and model sensitivity tests (see Appendix B).
Following a similar approach used in traditional crop models (Hybrid-Maize for corn, Yang et al., 2004; DSSAT for soybean, the Decision Support System for Agrotechnology Transfer, Jones et al., 2003), the dynamic crop growth model in Noah-MP-Crop uses the accumulated growing degree days (GDD) to determine eight plant growth stages (PGS, Liu et al., 2016): before seeding, emergence, initial vegetative, normal vegetative, initial reproductive, to maturity, after maturity, and after harvesting. Also in Liu et al. (2016), the dynamic crop growth parameters such as planting/harvest dates and GDD-based thresholds to determine plant growth stages are calibrated at two AmeriFlux sites in Bondville (Bo1), IL, for corn and Mead (Ne3), NE, for soybean.
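The GDD-driven stage logic described above can be sketched as follows. The stage names are those listed in the text. The daily GDD formula (mean temperature above a base, capped at a cutoff) and all threshold values are assumptions chosen for illustration; they are not the calibrated Noah-MP-Crop parameters.

```python
from bisect import bisect_right

STAGE_NAMES = {
    1: "before seeding", 2: "emergence", 3: "initial vegetative",
    4: "normal vegetative", 5: "initial reproductive", 6: "to maturity",
    7: "after maturity", 8: "after harvesting",
}


def daily_gdd(t_mean, t_base, t_cut):
    """Daily growing degree days (ASSUMED form: capped excess over a base)."""
    return min(max(t_mean - t_base, 0.0), t_cut - t_base)


def growth_stage(doy, gdd, plant_doy, harvest_doy, thresholds):
    """Map accumulated GDD to one of the eight plant growth stages.

    thresholds: five ascending GDD values separating stages 2 through 7.
    Stages 1 and 8 are set by the planting and harvest dates.
    """
    if doy < plant_doy:
        return 1
    if doy >= harvest_doy:
        return 8
    return 2 + bisect_right(thresholds, gdd)


# Hypothetical thresholds; planting/harvest days are the BULK corn defaults
thresholds = [50.0, 600.0, 1100.0, 1250.0, 1600.0]
stage = growth_stage(doy=200, gdd=700.0, plant_doy=111, harvest_doy=300,
                     thresholds=thresholds)
print(stage, STAGE_NAMES[stage])  # 4 normal vegetative
```

Keeping the planting and harvest dates outside the GDD thresholds mirrors the way the state-level date maps can later replace the uniform dates without touching the stage thresholds.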
Finally, the Noah-MP-Crop model allocates the assimilated carbohydrate to different parts of the plant, depending on the growth stage. For each stage, the total carbohydrate from the PSN scheme is partitioned to the leaf, stem, root, and grain according to stage-dependent fraction parameters (from 0 to 1). For example, during the vegetative stage, more carbon is allocated to the leaf relative to the stem and root, while in the reproductive stage, most of the assimilated carbon is allocated to the grain. Then, the simulated leaf biomass is converted to LAI by multiplying it by a model parameter, BIO2LAI (or specific leaf area, SLA). The values of BIO2LAI are constants and differ for corn (0.015) and soybean (0.030) (Liu et al., 2016).
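The allocation and LAI conversion steps can be sketched as below. The BIO2LAI values are those quoted from Liu et al. (2016); their units are assumed to be m²/g so that leaf biomass in g/m² gives a dimensionless LAI. The allocation fractions are hypothetical and only illustrate a vegetative-stage split.

```python
BIO2LAI = {"corn": 0.015, "soybean": 0.030}  # from the text; units ASSUMED m^2/g


def partition_carbon(carbon, fractions):
    """Split assimilated carbohydrate among plant parts for one growth stage."""
    if any(f < 0.0 or f > 1.0 for f in fractions.values()):
        raise ValueError("each fraction must lie between 0 and 1")
    if sum(fractions.values()) > 1.0 + 1e-9:
        raise ValueError("fractions must not sum to more than 1")
    return {part: carbon * f for part, f in fractions.items()}


def leaf_mass_to_lai(leaf_mass_g_m2, crop):
    """Convert leaf biomass (g/m^2) to LAI with the constant BIO2LAI."""
    return leaf_mass_g_m2 * BIO2LAI[crop]


# Hypothetical vegetative-stage fractions: most carbon goes to the leaf
vegetative = {"leaf": 0.5, "stem": 0.3, "root": 0.2, "grain": 0.0}
print(partition_carbon(10.0, vegetative))
print(leaf_mass_to_lai(200.0, "corn"))  # close to 3
```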

Irrigation Scheme
A dynamic irrigation scheme was integrated into Noah-MP and tested at field and regional scales without using the Noah-MP-Crop model (Xu et al., 2019). In this study, we adopt the same approach and couple it with dynamic crop growth, enabling two-way crop-irrigation interactions.
Plant photosynthesis and respiration processes are limited by water stress during droughts. Therefore, irrigation plays a critical role in both the water and carbon cycles by relieving water stress, especially for crops planted in arid and semiarid regions. In Noah-MP, the water stress function is plant- and soil-dependent and is determined by the integrated soil moisture availability (SMA) in the root zone. As in Xu et al. (2019), the root-zone SMA is also employed as a basic irrigation trigger. For the irrigated cropland, the root-zone SMA is defined as the ratio of the current root-zone available soil moisture (current soil moisture SM minus the wilting point SM_wlt) to the nonstress soil moisture (SM_ref − SM_wlt):
$$\mathrm{SMA} = \frac{\mathrm{SM} - \mathrm{SM}_{wlt}}{\mathrm{SM}_{ref} - \mathrm{SM}_{wlt}}$$
The irrigated cropland is defined as the fraction within a cultivated grid cell (F_irr-crop) and takes the smaller value of F_irr and F_crop · F_veg (cropland fraction relative to the model grid cell's total area) in Figure 3a:
$$F_{irr\text{-}crop} = \min\left(F_{irr},\; F_{crop} \cdot F_{veg}\right)$$
The irrigation triggering mechanism includes four criteria: (1) F_irr-crop > IRR_FRC (an irrigation fraction threshold); (2) the day falls within the growing season, defined by the planting/harvesting date map above; (3) SMA < IRR_CRI (soil moisture trigger; see Figure 3b); and (4) no irrigation on rainy days. These criteria are checked daily, and if irrigation is triggered, the potential irrigation amount for the day (IWA) is computed to maintain root-zone soil moisture, capped at IRR_LIM, the daily maximum irrigation amount, which is limited by the capability of the irrigation system and water availability.
The above irrigation scheme is executed for each crop type in each irrigated grid cell to obtain the irrigation water amount for corn (IWA_corn) and soybean (IWA_soybean).
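The daily irrigation check described in this section can be sketched as follows. The SMA definition, the irrigated cropland fraction, and the four trigger criteria follow the text. The irrigation amount function is an assumption: it refills the root zone to the nonstress level SM_ref over a hypothetical root depth and caps the result at IRR_LIM, which may differ from the formulation in Xu et al. (2019). All numerical values are illustrative.

```python
def soil_moisture_availability(sm, sm_wlt, sm_ref):
    """Root-zone SMA: available soil moisture relative to the nonstress range."""
    return (sm - sm_wlt) / (sm_ref - sm_wlt)


def irrigated_crop_fraction(f_irr, f_crop, f_veg):
    """F_irr-crop: the smaller of F_irr and F_crop * F_veg."""
    return min(f_irr, f_crop * f_veg)


def irrigation_triggered(f_irr_crop, irr_frc, in_season, sma, irr_cri, rainy):
    """All four criteria from the text must hold on a given day."""
    return f_irr_crop > irr_frc and in_season and sma < irr_cri and not rainy


def daily_irrigation_mm(sm, sm_ref, root_depth_mm, irr_lim_mm):
    """ASSUMED amount: refill the root zone to sm_ref, capped at IRR_LIM."""
    deficit_mm = max(sm_ref - sm, 0.0) * root_depth_mm
    return min(deficit_mm, irr_lim_mm)


# Hypothetical grid cell on a dry day within the growing season
sma = soil_moisture_availability(sm=0.20, sm_wlt=0.10, sm_ref=0.30)
f_irr_crop = irrigated_crop_fraction(f_irr=0.4, f_crop=0.8, f_veg=0.9)
if irrigation_triggered(f_irr_crop, irr_frc=0.1, in_season=True,
                        sma=sma, irr_cri=0.6, rainy=False):
    print(daily_irrigation_mm(0.20, 0.30, root_depth_mm=1000.0, irr_lim_mm=50.0))
```

Because the growing-season flag is one of the criteria, a shorter season from later planting directly reduces the number of days on which irrigation can be applied.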

Model Setup
The model domain is identical to the central U.S. domain in Xu et al. (2019). It comprises 501 grids (north-south) × 601 grids (west-east) at 4-km resolution, covering the major part of the corn belt in the central United States. The simulation period ranges from 1 October 1999 to 31 December 2004, covering five growing seasons. The atmospheric forcing data are from the North American Land Data Assimilation System (NLDAS, Cosgrove et al., 2003) forcing data set at 0.125° and hourly resolutions. The precipitation forcing is generated by combining observations from field stations, Stage IV radar retrievals from the Next Generation Weather Radar System, and satellite data. A 10-year spinup period was used to ensure that the soil moisture and temperature reach an equilibrium state. An elevation adjustment was applied to the surface pressure, longwave radiation, near-surface temperature, and humidity fields to account for topography differences between the model and NLDAS grids.
Six experiments were performed to assess Noah-MP's performance in joint crop-irrigation modeling (see Table 1). The first experiment (BULK) is a simulation with dynamic crop growth but without irrigation, in which uniform planting and harvest dates are applied over the whole domain. It adopts the default planting/harvest dates (day of year) initially calibrated for corn in Bondville, IL, and soybean in Mead, NE (for corn: Julian day 111/300; for soybean: Julian day 130/280). The second experiment (BULK_IRR) is the same as BULK but with the calibrated dynamic irrigation scheme activated (Xu et al., 2019). The third (STATE) and fourth (STATE_IRR) simulations are the same as BULK and BULK_IRR but use the state-level planting and harvest dates shown in Figure 2. The BULK/BULK_IRR simulations are referred to as the baseline simulations, and the difference between BULK/BULK_IRR and STATE/STATE_IRR represents the impacts of spatially varied planting/harvest dates on crop yield and irrigation amount. The fifth (0.5N) and sixth (0.5N_IRR) simulations are the same as STATE and STATE_IRR but with the nitrogen concentration reduced by half. The difference between STATE/STATE_IRR and 0.5N/0.5N_IRR can be attributed to the impacts of nitrogen concentration. Furthermore, comparing the difference between STATE_IRR and STATE with that between 0.5N_IRR and 0.5N demonstrates the impacts of irrigation under N-sufficient and N-stressed conditions.
Figure 4 shows the county-level corn yields reported by USDA and the results from the six experiments (5-year average from 2000 to 2004). Yields from BULK and STATE compare well with the USDA report in magnitude and spatial pattern in the rainfed region but are underestimated in heavily irrigated regions such as Southeast Nebraska. Using the dynamic irrigation scheme in BULK_IRR and STATE_IRR reduces the yield bias in irrigated regions. The differences between BULK and STATE are further discussed in section 3.2.
The 0.5N experiment significantly reduces yield over more than 60% of the domain due to nitrogen stress, which is similar to the CROP_DFLT scenario in Leng et al. (2016), where the stress results from the fast denitrification in the default version of CLM4.5. In this case, using the irrigation scheme (0.5N_IRR) yields little improvement under nitrogen stress.
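The six experiments described above differ in only three settings: the planting/harvest dates, the irrigation switch, and the nitrogen scaling. The sketch below records them in a small table-like structure. The class and field names are hypothetical and do not correspond to actual Noah-MP namelist options.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    name: str
    planting_dates: str    # "uniform" (field-calibrated) or "state" (USDA-based)
    irrigation: bool       # dynamic irrigation scheme on or off
    nitrogen_scale: float  # 1.0 = default, 0.5 = nitrogen reduced by half


EXPERIMENTS = [
    Experiment("BULK", "uniform", False, 1.0),
    Experiment("BULK_IRR", "uniform", True, 1.0),
    Experiment("STATE", "state", False, 1.0),
    Experiment("STATE_IRR", "state", True, 1.0),
    Experiment("0.5N", "state", False, 0.5),
    Experiment("0.5N_IRR", "state", True, 0.5),
]


def irrigation_pairs(experiments):
    """Pair each non-irrigated run with its irrigated counterpart."""
    by_key = {(e.planting_dates, e.nitrogen_scale, e.irrigation): e.name
              for e in experiments}
    return [(by_key[(dates, n, False)], by_key[(dates, n, True)])
            for (dates, n, irrigated) in by_key if not irrigated]


print(irrigation_pairs(EXPERIMENTS))
# [('BULK', 'BULK_IRR'), ('STATE', 'STATE_IRR'), ('0.5N', '0.5N_IRR')]
```

Each pair isolates the effect of irrigation while holding the planting dates and nitrogen level fixed.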

Model Performance
As for the soybean yields shown in Figure 5, BULK and STATE give good estimates of yield in the major soybean production areas in the United States (MI, IL, IA, WI, MN, SD), but markedly underestimate the yield in irrigated regions such as NE, AR, and MS. In the 0.5N nitrogen-stressed condition, soybean yields are substantially underpredicted over the entire domain. The dynamic irrigation scheme helps improve yield in the BULK_IRR and STATE_IRR simulations, but it shows little impact under the nitrogen-stressed condition in 0.5N_IRR. These results for corn and soybean suggest that the impacts of irrigation on yields in the irrigated regions are significant but occur only with sufficient fertilization.

Transition From Field to Regional Scale Crop Modeling
The second objective of this study is to transition crop modeling from field to regional scale by first exploring the use of spatially varying planting/harvesting dates for regional simulation. The impacts of spatially varying planting/harvest dates on modeled crop yield and irrigation amount can be assessed by comparing the results from the BULK_IRR and STATE_IRR simulations, as shown in Figure 6. The bars are ranked by yield from low to high in each of these states, and the black lines represent the delay (in days) in planting date relative to the uniform planting date in BULK_IRR (Julian day 111 for corn and 130 for soybean). The delayed planting in each state implies a shorter growing season, which results in lower yields in STATE_IRR than in BULK_IRR for both corn and soybean. These reduced yields help correct the high bias of BULK_IRR in all states except South Dakota and Minnesota, where STATE_IRR underestimates both corn and soybean yields. Figure 7 shows the yield reduction per day of planting delay (bu/ac/day) for corn and soybean. The impact of planting date on yield may be more complex than a linear relationship, but the sensitivity of modeled yield to planting delay varies strongly across states. For both corn and soybean, a clear north-to-south gradient is evident: the impacts of planting date are strong in northern states such as Minnesota, Iowa, Wisconsin, and Michigan. For soybean, the planting region in the lower Mississippi River valley also shows a clear dependence on planting date. Moreover, this north-to-south gradient also appears within individual states. It is most obvious in Minnesota, Iowa, Illinois, and Indiana, where, for both corn and soybean, the modeled yield in the northern part of the state is more sensitive to planting delay than in the south.
In South Dakota, the model shows very little sensitivity to the planting date, suggesting that the modeled yield may be limited by water stress (Figure S1 confirms that the underestimated yields in Eastern South Dakota and Western Minnesota are water-limited). However, the low irrigation fractions in these two regions (Figure 3a) suggest that irrigation is not a significant water source for crop production. Therefore, we suspect that the perched shallow water table in the northern corn belt plays a role in supplying water for corn production (Rizzo et al., 2018). Note that the model applies a free drainage scheme for deep soil drainage, and the complex two-way groundwater exchange processes are not considered in this study.
Transforming the planting date from a uniform value at point scale to spatially varied values at the state level could also influence the modeled irrigation amount, as the irrigation period is constrained by the crop growing season. Figure 8 shows the spatial distribution of the USGS water withdrawal report at the county level in 2000 and the modeled irrigation amount from BULK_IRR and STATE_IRR. BULK_IRR, with uniform planting/harvesting dates, overestimates the irrigation amount compared to the USGS reported data, especially in the Lower Mississippi River Basin (LMRB). The largest overestimation in irrigation amount is over 100 mm and occurs in Poinsett, Arkansas, where USGS reported 459.2 mm and BULK_IRR simulated 561.3 mm. The overestimated irrigation amount in BULK_IRR has an intuitive explanation: the longer the growing season, the more water is needed to maintain soil moisture at the critical level. The scatter plot in Figure 9 of the irrigation amount from the two simulations also confirms the overestimate in BULK_IRR, especially in the LMRB. After applying the spatially varying planting/harvesting dates, the performance of STATE_IRR in the LMRB improves relative to BULK_IRR (RMSE decreases from 29.67 to 26.24 mm, and the coefficient of determination, R², increases from 0.89 to 0.92). STATE_IRR also reduces the irrigation amount in Nebraska, but not as much as in the LMRB. In fact, the USGS county-level report represents an upper bound, because it gives the total water withdrawal, not all of which is necessarily used for irrigation. Therefore, the model-simulated irrigation amount should not exceed the USGS report. Hence, STATE_IRR, which simulates a smaller irrigation amount, performs better than BULK_IRR.

Journal of Advances in Modeling Earth Systems
Figure 10 shows the LAI and grain mass at the two AmeriFlux sites (Ne1 and Ne2). The LAI simulated by STATE and STATE_IRR agrees well with observations at Ne1 for corn throughout the growing season but is underestimated at Ne2 in 2002 for soybean. At the crop reproductive stage (grain production), the differences in yield between these two simulations are evident. The STATE simulation significantly underestimates corn yield at both sites, by 31% to 80%, but using the irrigation scheme greatly improves corn yield at both sites.

Impacts of Irrigation on Crop Yield
Irrigation does not improve soybean yield as much as corn yield, even with a similar total irrigation amount. Chen et al. (2018) also noted that the increase in crop yield due to irrigation depends strongly on crop species. This may be attributed to the different biogeochemical characteristics of these two plants (corn is C4 and soybean is C3), including photosynthesis and respiration, which affect their water-use efficiency. Figure 11 shows the USDA yield data (5-year average) and the six simulations in this study, aggregated at the state level. The comparison between BULK and BULK_IRR, and between STATE and STATE_IRR, in irrigated regions shows the improvement in yield with the irrigation scheme activated. For corn, the yield in BULK_IRR (156.5 bu/ac) is more than double that in BULK (74.61 bu/ac). The difference between BULK_IRR and STATE_IRR shows that the prolonged growing season leads to overestimated yield in BULK_IRR, due to the increase in modeled irrigation amount.
Moreover, STATE_IRR and 0.5N_IRR represent the impacts of irrigation on crop yield under conditions of sufficient and stressed nitrogen, respectively. The near doubling of yield with irrigation in STATE_IRR (from 74.28 to 143.5 bu/ac) shrinks to a much smaller gain under the nitrogen-stressed condition in 0.5N_IRR (from 51.52 to 68.41 bu/ac). This is similar to the results of Leng et al. (2016), in which the irrigation scheme was applied to the default CLM4.5 run with a fast denitrification rate. Thus, the impact of irrigation under such nitrogen-stressed conditions is limited. However, when nitrogen is not limiting, the impacts of irrigation manifest and improve crop yield. Table 2 presents the statistics from all simulations, including RMSE (both in bu/ac and relative to the USDA report) and the coefficient of determination (R²). These statistics confirm that under sufficient nitrogen concentration and state-level planting/harvest management, the application of a dynamic irrigation scheme (STATE_IRR) improves the modeled yield performance for both corn and soybean, reducing RMSE from 47.8% to 22.3% for corn and from 18.9% to 16.8% for soybean.
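The evaluation statistics reported in Table 2 can be computed along the following lines. This is a sketch under two assumptions that the text does not spell out: the relative RMSE is taken as the RMSE divided by the mean of the observed (USDA) yields, and R² is taken as the squared Pearson correlation between simulated and observed county yields. The input values are hypothetical.

```python
import math


def rmse(sim, obs):
    """Root-mean-square error between simulated and observed values."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def relative_rmse_percent(sim, obs):
    """RMSE as a percentage of the observed mean (ASSUMED normalization)."""
    return 100.0 * rmse(sim, obs) / (sum(obs) / len(obs))


def r_squared(sim, obs):
    """Squared Pearson correlation (ASSUMED definition of R^2)."""
    n = len(obs)
    mean_s, mean_o = sum(sim) / n, sum(obs) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs))
    var_s = sum((s - mean_s) ** 2 for s in sim)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov * cov / (var_s * var_o)


# Hypothetical county yields in bu/ac
usda = [100.0, 150.0, 200.0]
model = [110.0, 140.0, 210.0]
print(rmse(model, usda), relative_rmse_percent(model, usda), r_squared(model, usda))
```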

Discussion
Several uncertainties can contribute to the differences between simulated crop yields and the USDA report, including those associated with discrepancies between available data sets, crop yield gaps, and crop/irrigation model parameters, which are the subject of discussion in this section.

Yield Gaps Between Actual Yield and Modeled Potential or Water-Limited Yield
The yield potential (Y_p) is defined as the yield an adapted crop cultivar could achieve by alleviating all abiotic and biotic stresses through optimal crop and soil management (Lobell et al., 2009). Thus, Y_p is achieved when management eliminates all limitations to crop growth and yield from nutrient deficiencies, water deficit or surplus, toxicities, salinity, weeds, insect pests, and pathogens. In our study, for irrigated corn and soybean, the model provides sufficient water and nitrogen; hence, the modeled yield should be close to Y_p. For rainfed crops, the modeled yield is less than the potential yield due to water limitation (Y_w, water-limited yield). The actual yield (Y_a) is taken from the USDA NASS data set. Therefore, the relative yield gap (Y_g) can be calculated as the difference between the modeled yield (Y_p for irrigated crops or Y_w for rainfed crops) and Y_a, expressed as a fraction of the modeled yield.
Quantifying the yield gaps for each crop cultivar in different growing regions is still a research topic in the food production community. The Global Yield Gap Atlas (GYGA, www.yieldgap.org) provides estimates of untapped crop production potential on existing farmland based on current climate and available soil and water resources. GYGA's estimates of Y_g in the United States are 10% to 20% for irrigated corn and 20% to 30% for rainfed corn. In our study, Y_g values are calculated between the USDA county-level data and our model simulations and are listed in Table 3; they are 13% to 25% for irrigated corn and 17% to 28% for rainfed corn. These numbers are comparable to those given by GYGA. However, the yield gaps for soybean are 15% to 32% for irrigated and 14% to 39% for rainfed soybean, which are higher than in other studies (e.g., 9% to 24% in Egli & Hatfield, 2014; 10% to 30% in Grassini et al., 2015), especially for rainfed soybean, consistent with the overestimation in IL, IN, and OH.
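The relative yield gap discussed above can be sketched as a simple ratio. The normalization by the modeled yield (Y_p for irrigated crops, Y_w for rainfed crops) is an assumption based on the common definition of a relative yield gap, and the example yields are hypothetical rather than values from Table 3.

```python
def relative_yield_gap(modeled_yield, actual_yield):
    """Relative yield gap: (modeled - actual) / modeled.

    modeled_yield: Y_p (irrigated) or Y_w (rainfed), in bu/ac
    actual_yield:  Y_a from the USDA NASS survey, in bu/ac
    """
    return (modeled_yield - actual_yield) / modeled_yield


# Hypothetical irrigated corn county: modeled 180 bu/ac, reported 150 bu/ac
gap = relative_yield_gap(180.0, 150.0)
print(f"{100.0 * gap:.1f}%")  # 16.7%
```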

Uncertainties in Crop Model Parameters
The development of LSMs has expanded beyond its initial purpose of providing reliable lower boundary conditions for coupled climate and weather models to include terrestrial biogeochemical processes, land use change, and dynamic vegetation growth (Bonan et al., 2011). Many LSMs adopt the Farquhar-Ball-Berry scheme to simulate the coupled leaf-level photosynthesis and stomatal conductance (Ball et al., 1987; Collatz et al., 1991, 1992; Farquhar et al., 1980; Niu et al., 2011; Oleson et al., 2013). Those biophysiological models require a variety of plant-specific parameters, such as the minimum stomatal conductance, respiration rate, and rubisco capacity (V_cmx25), which are usually measured under field experimental conditions. Bonan et al. (2011) reviewed the past literature on PSN-stomata parameterization in LSMs and found that V_cmx25 is the most critical parameter in modeling plant photosynthesis. This parameter characterizes the maximum carbon assimilation rate and is measured in laboratory conditions, given sufficient leaf-level radiation and CO₂ concentration at 25°C. Bonan et al. (2011) concluded that the leaf-level measured V_cmx25, when scaled up to an LSM grid cell, could lead to higher photosynthetic rates when nitrogen is nonlimiting (such as in cropland systems). Furthermore, the V_cmx25 parameter is poorly constrained and remains model dependent across LSMs. Table B1 provides a synthesis of the parameters used in several studies. The wide range of V_cmx25 values (from 30 to 101 μmol m⁻² s⁻¹) and the different treatments of the product-limiting pathway in the PSN calculation (K_p) demonstrate a significant uncertainty in specifying the model-dependent PSN parameters. Hence, calibration of the PSN parameters becomes critical, but it has usually been conducted at field scales using measurements of moisture and carbon fluxes.
The Noah-MP-Crop model (Liu et al., 2016) uses generic crop PSN parameters, which do not distinguish between C3 and C4 crops. To incorporate corn-specific PSN parameters into the Noah-MP-Crop parameter table, we performed a calibration for C4 corn using the LAI and biomass data from the AmeriFlux Bo1 site in Bondville, IL. The calibrated values are listed in the bottom row of Table B1, noted as "Adjust", meaning they are calibrated and subject to adjustment. The main result of the calibration is a reduction of the overestimated rainfed corn yield, achieved by lowering V_cmx25 from the default value (80 μmol m⁻² s⁻¹) to 60 μmol m⁻² s⁻¹. The calibration results are presented in Figure S2. For soybean, the default C3 crop parameters were used in this study. He et al. (2019) provide a global rubisco capacity map derived from the satellite-observed solar-induced chlorophyll fluorescence (SIF) record. Through data assimilation methods, the 11-year SIF record shows both spatial and temporal variation of V_cmx25 in the world's major crop production regions. Incorporating this spatial map of V_cmx25 into ESMs and LSMs would help constrain the wide range of this model parameter.

Crop Model Parameter Uncertainties-Planting/Harvesting Management
Representing dynamic crop phenology in LSMs is critical for predicting the energy, water, and carbon budgets in croplands and may even influence the atmospheric boundary layer, especially in areas with large cropland coverage (Betts, 2005; Ma et al., 2012). In some LSMs, the timing of planting and harvesting, as well as the plant growth stages, is calibrated against field data. These calibration efforts are therefore local, and few studies have quantified the impact of planting dates on simulated crop phenology over a large region. For example, in CLM4-Crop, planting is activated by three temperature thresholds: a 20-year averaged GDD threshold, a threshold on the 10-day running mean of air temperature, and a threshold on daily minimum temperature (Levis et al., 2012). Chen et al. (2018) evaluated CLM4-Crop at multiple AmeriFlux sites over the U.S. corn belt and found an early-season overestimate of LAI caused by a too-early start of planting. A modified simulation with locally accurate planting dates improved the simulated energy and water fluxes, as well as the net ecosystem exchange (NEE).
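The three-threshold planting trigger described above can be sketched as follows. This is only a schematic reading of the description: the function name, the argument layout, and all threshold values are placeholders chosen for illustration and are not the CLM4-Crop parameter values.

```python
from statistics import mean


def planting_allowed(gdd_20yr_mean, daily_tavg, daily_tmin,
                     gdd_min=50.0, tavg10_min=10.0, tmin_min=6.0):
    """Return True when all three temperature criteria are met.

    gdd_20yr_mean: 20-year averaged growing degree days (degC day).
    daily_tavg, daily_tmin: recent daily values (degC), most recent last.
    Threshold defaults are hypothetical placeholders.
    """
    if len(daily_tavg) < 10 or not daily_tmin:
        return False  # not enough history for a 10-day running mean
    tavg10 = mean(daily_tavg[-10:])
    return (gdd_20yr_mean >= gdd_min
            and tavg10 >= tavg10_min
            and daily_tmin[-1] >= tmin_min)


# Hypothetical warm spell: all three criteria satisfied.
print(planting_allowed(100.0, [12.0] * 10, [7.0]))
```

A trigger of this kind responds to any warm spell that satisfies the thresholds, which is consistent with the too-early planting reported by Chen et al. (2018) when the thresholds are not locally tuned.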
In Noah-MP-Crop, the planting and harvesting dates are prescribed parameters, which Liu et al. (2016) set to reflect the spatial and year-to-year variation of planting/harvesting dates at the Bo1 and Ne3 sites. In this study, the BULK_IRR simulation with an early and spatially invariant planting date overestimated the crop yield and irrigation amount for corn and soybean, consistent with the results of Chen et al. (2018). By contrast, the STATE_IRR simulation with spatially varying and delayed planting dates effectively mitigated those overestimations (Figure 6). Figure 7 shows that the northern states in the corn belt are more affected by the delayed planting date than the southern states, and this north-to-south gradient is evident within each state as well.
Although the state-level planting/harvesting dates applied in STATE_IRR represent their spatial variation to some degree, uncertainties still exist. The USDA usual planting/harvesting date report gives the most active window for planting and harvesting based on surveys over the last 20 years. In the STATE and STATE_IRR simulations, the middle date of this window is selected for each state. However, applying a single planting/harvesting date at the state level is still unrealistic. Figure 7 shows the spatial variation of the modeled crop yield sensitivity to a delay in planting date, and the range of these crop yield responses is given in Table 3.
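Selecting the middle date of a most-active window is simple date arithmetic, sketched below. The function name and the example window are hypothetical and are not taken from the USDA report.

```python
from datetime import date, timedelta


def window_midpoint(start, end):
    """Middle date of a planting or harvesting window (rounded down)."""
    if end < start:
        raise ValueError("window end precedes start")
    half_span = timedelta(days=(end - start).days // 2)
    return start + half_span


# Hypothetical most-active planting window for one state.
mid = window_midpoint(date(2015, 4, 25), date(2015, 5, 18))
print(mid.isoformat(), "day of year:", mid.timetuple().tm_yday)
```

Collapsing a window of several weeks to one date discards the within-state spread, which is the simplification the paragraph above identifies as unrealistic.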
To better constrain the crop growing seasons, it is necessary to incorporate spatially detailed crop calendars. For example, the planting and harvesting windows can be dynamically modeled based on field workability, considering snow cover and rainfall, and on the crop's biological requirements for heat and moisture (Iizumi et al., 2019). Dynamically modeling the crop calendar will likely reduce the uncertainties in specifying crop growing seasons in future crop model development, especially in regions where agricultural management data are sparse.

Crop Model Parameter Uncertainties-Convert Leaf Mass to LAI
Figure 12 shows the reciprocal of measured leaf mass per unit area (LMA, g/m²) from Ne1 and Ne2 from 2001 to 2007, which demonstrates significant in-season variations for both corn and soybean. For corn, this reciprocal decreases from 0.03 m²/g at the early growing stage to 0.01 m²/g at the end of the growing season. This characterizes a general feature of corn leaf growth: extensive leaf expansion (larger LAI) with a small amount of mass at the beginning of the growing season, followed by leaf thickening (more mass) with only a slight increase in LAI. The inverse of LMA for soybean has less variability, and its values are generally higher than for corn during the growing season (ranging from 0.018 to 0.029 m²/g).
The ranges of LMA are listed in Table 3 and compared with the default constant value of BIO2LAI in Noah-MP-Crop, which has the same physical meaning as 1/LMA and is used to convert the prognosed leaf mass to diagnosed LAI. BIO2LAI is set as a constant for corn (0.015) and soybean (0.030). Such a constant conversion coefficient is used in other LSMs too, for example, the specific leaf area (SLA) parameter in CLM (Oleson et al., 2013). The substantial seasonal variations of 1/LMA in Figure 12 point to the challenges of using a constant BIO2LAI throughout the entire crop growing season, and a time-varying conversion coefficient is needed in future model development.
Table 3 summarizes the aforementioned uncertainties and provides the default values in Noah-MP-Crop and the ranges of uncertainty for the yield gaps (between USDA-reported actual yield and modeled yield) and three model parameters (V_cmx25, planting date, and BIO2LAI). The uncertainty associated with mechanical drying after harvest mentioned in section 2 is also included in Table 3.
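The leaf-mass-to-LAI conversion discussed in this section can be illustrated with a minimal sketch that contrasts the constant corn BIO2LAI (0.015) with a time-varying coefficient. The linear decline from 0.03 to 0.01 m²/g is a hypothetical interpolation between the corn 1/LMA end points quoted for Figure 12, not a fitted relationship, and the function names and leaf mass are illustrative.

```python
def leaf_mass_to_lai(leaf_mass, bio2lai):
    """Diagnose LAI (m2/m2) from leaf mass (g/m2) and a coefficient (m2/g)."""
    return leaf_mass * bio2lai


def linear_bio2lai(frac_season, early=0.03, late=0.01):
    """Hypothetical time-varying coefficient with a linear seasonal decline.

    frac_season runs from 0 (early growing stage) to 1 (end of season);
    values outside that range are clamped.
    """
    frac = min(max(frac_season, 0.0), 1.0)
    return early + (late - early) * frac


leaf_mass = 100.0  # g/m2, hypothetical
for frac in (0.0, 0.5, 1.0):
    constant = leaf_mass_to_lai(leaf_mass, 0.015)  # default corn BIO2LAI
    varying = leaf_mass_to_lai(leaf_mass, linear_bio2lai(frac))
    print(frac, round(constant, 2), round(varying, 2))
```

For the same leaf mass, the time-varying coefficient gives a larger LAI early in the season and a smaller one late in the season than the constant, mirroring the in-season behavior of 1/LMA described for corn.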

Conclusion
This study evaluated the performance of Noah-MP-Crop's joint modeling of crop and irrigation in the central United States. By incorporating high-resolution spatial data sets of crop and irrigation fraction and state-level planting/harvesting dates, the crop model can be applied at the regional scale. The impacts of irrigation on crop yield are assessed from field to regional scale as well as under nitrogen-sufficient and nitrogen-stressed conditions. Several uncertainties, including model parameters, yield gaps, and discrepancies between available data sets, are also assessed.
The results showed that in the U.S. corn belt the bulk simulation (with uniform planting/harvesting dates and no irrigation) captured the magnitude and spatial variation of corn yield against the USDA county-level report (RMSE = 28.1% for the whole domain). However, in heavily irrigated regions, for example, in Nebraska, the yield was substantially underestimated (RMSE = 48.7% in the irrigated region). Adding irrigation modeling capability effectively improved the yield simulation over the irrigated region (RMSE = 23.1%). The RMSEs for soybean over the whole domain and the irrigated region are 28.4% and 20.5%, respectively. The irrigation improvements on soybean yield are relatively small compared to those for corn. Noticeable overestimation of yield for corn and soybean still exists in the northeast of the domain, in Indiana and Ohio, which may be attributed to early planting biases and the yield gap between actual and modeled yield.
To move beyond a homogeneous transfer of the crop model parameters from field to regional scale, two simulations with state-level planting/harvesting dates were conducted. These spatially varied planting dates were in general later than the uniform planting dates. The delayed planting dates across states resulted in a reduction in modeled yield and irrigation amount, which reduced the yield overestimation associated with the early planting bias. A spatial analysis also showed that the modeled yield in northern states was more sensitive to delayed planting than in southern states for rainfed corn and soybean. This north-to-south gradient was evident within each northern state as well (IL, IN, IA, MN, WI). This indicates that using a single planting/harvesting date for each state is still an oversimplified assumption, which is inadequate to represent the complex decisions of agricultural management. Comprehensive, high-resolution data sets of cropping calendars are needed for future crop model development.
Dynamic modeling of crop growth and irrigation application is challenging and subject to many uncertainties. Several sources of uncertainty were identified, including yield gaps, model parameters associated with photosynthetic rubisco capacity and planting date, and discrepancies between different observational data sets.
The rubisco capacity (V_cmx25) is a significant source of uncertainty, and we calibrated it for C4 corn using a single-point simulation at Bondville.
Fertilization has been identified as a source of uncertainty in previous studies (Leng et al., 2016). In this study, it was assumed that the crops are not nitrogen-stressed. To investigate the impacts of irrigation on crop yield under nitrogen stress, two additional sets of simulations were conducted in which the nitrogen concentration was halved. When the nitrogen concentration is reduced to half, nitrogen stress could cut crop yield by 48.6% and 73.8% for corn and soybean, respectively (comparing 0.5N with STATE). The irrigation improvements on crop yields under nitrogen stress are limited (comparing 0.5N and 0.5N_IRR), with increases of 32% and 1% for corn and soybean, respectively. These numbers are much smaller than those under sufficient nitrogen (comparing STATE and STATE_IRR, 93% for corn and 27% for soybean). We conclude that the benefit of irrigation for crop yield depends on sufficient nitrogen concentration.
The present study contributes to the knowledge of simulating crop yield and irrigation water amount in one of the world's most productive agricultural regions and investigates the impacts of irrigation on crop yields. The irrigation effect on crop yield under conditions without nutrient stress, which was often ignored in previous research, is addressed in this study. However, other sources of uncertainty arise from crop model photosynthesis and phenology parameters, yield gap, and unit conversion. To mitigate these uncertainties, we demonstrated that calibrating the crop rubisco capacity parameter and constraining the growing season with spatially varying planting/harvesting dates can improve crop simulation results. Finally, future efforts should be dedicated to incorporating spatially detailed rubisco capacity parameters and crop calendars to better constrain crop growth dynamics.