Perspectives on the Future of Land Surface Models and the Challenges of Representing Complex Terrestrial Systems

Land surface models (LSMs) are a vital tool for understanding, projecting, and predicting the dynamics of the land surface and its role within the Earth system, under global change. Driven by the need to address a set of key questions, LSMs have grown in complexity from simplified representations of land surface biophysics to encompass a broad set of interrelated processes spanning the disciplines of biophysics, biogeochemistry, hydrology, ecosystem ecology, community ecology, human management, and societal impacts. This vast scope and complexity, while warranted by the problems LSMs are designed to solve, has led to enormous challenges in understanding and attributing differences between LSM predictions. Meanwhile, the wide range of spatial scales that govern land surface heterogeneity, and the broad spectrum of timescales in land surface dynamics, create challenges in tractably representing processes in LSMs. We identify three “grand challenges” in the development and use of LSMs, based around these issues: managing process complexity, representing land surface heterogeneity, and understanding parametric dynamics across the broad set of problems asked of LSMs in a changing world. In this review, we discuss progress that has been made, as well as promising directions forward, for each of these challenges.


Introduction
The land surface is the only part of the Earth system that is directly experienced by the majority of humans, terrestrial animals, and plants. Land surface processes mediate the majority of the impacts of climate on human societies and ecosystems, and accurate representation of land surface processes is critical for our understanding of how climate and climate change actually affect living systems. Land surface models (LSMs) are numerical models that solve the coupled fluxes of water, energy, and carbon between the land surface and atmosphere, within a context of direct and indirect human forcings and ecological dynamics. LSMs are arguably the most sophisticated tools that society currently possesses for predicting how the conditions for life on the surface of the Earth will change in the coming years, decades, and centuries. The scope of land surface modeling activities naturally encompasses a huge set of overlapping and interconnected disciplines relevant to this problem.
In this paper, we attempt to provide a high-level illustration of a set of different classes of challenges that arise from such a complex and high-dimensional activity. We further indicate, where appropriate, promising approaches around which one might organize the development of tools that can predict the complex and heterogeneous functioning of the land surface under the radically altered climatic, ecological, and societal conditions anticipated by Earth system projections.
Land surface models were originally developed (and thus continue to be primarily supported) by atmospheric/climate modeling and forecasting activities that demand physical boundary conditions in terms of energy partitioning, surface roughness, and albedo, to represent the influence of the land on meteorological processes. As applied to the global climate change problem, two key model results set the LSM community on its current trajectory: (1) the prediction that plant biophysical responses to elevated CO 2 could have an appreciable effect on the global climate itself (Sellers et al., 1996), and (2) that coupling of climate and carbon cycle could substantially strengthen the rate of global warming (Cox et al., 2000). The need for LSMs to quantify such biogeophysical and biogeochemical feedbacks (respectively) to the climate system has formed the basis of their recent development, but increasingly, questions pertaining to the impacts on the land surface itself have attained a higher profile.
State-of-the-art LSMs (e.g., Decharme et al., 2019;Wiltshire et al., 2019;Yokohata et al., 2019) typically provide a set of prognostic variables related to land-mediated feedbacks on global biogeochemical cycles. In particular, the terrestrial carbon cycle, by partially controlling what fraction of CO 2 that humans emit remains in the atmosphere, has a role in determining the transient climate response to emissions and the remaining carbon emissions budget compatible with a given climate goal (Matthews et al., 2018). In addition, LSMs predict changes in the biophysical function of the land surface as climate and ecosystems change and thus how the land interacts with both the atmosphere and with rivers and downstream ecosystems. Lastly, LSMs provide information on risks to human societies and natural ecosystems associated with future climate scenarios, including crop productivity, heat waves, urban climates, the severity and frequency of fire and other disturbances, flooding, ecosystem productivity, permafrost and land ice status, and health and freshwater availability.
Through time, representations of numerous processes that are known to impact the dynamics of systems relevant to these questions have been incrementally added to LSMs. As a result, land surface models have expanded from their initial simple biophysical configurations (Sellers et al., 1986), to include representations of soil moisture dynamics, stomatal functioning, land surface heterogeneity, surface hydrological processes, plant and soil carbon cycling, dynamic vegetation distributions, fire, urban environments, land cover and management, nitrogen cycling and crops (Lawrence et al., 2019, Figure 1), and latterly plant demographic processes Sato et al., 2007;Weng et al., 2017), phosphorus cycling, (Goll et al., 2017;Reed et al., 2015;Yang et al., 2014), and plant hydraulics (Joetzjer et al., 2018;Kennedy et al., 2019). This ever-widening scope of land surface models may be unavoidable, due to the interrelated nature of the questions being asked of them. For example, the processes that govern carbon cycle feedbacks are highly affected by both biophysical feedbacks in the Earth system and by land use decisions that are in turn affected by climate impacts on human societies. Climate change impacts such as drought and fire are mediated by plant biophysical responses to elevated CO 2 , which are themselves impacted by limitations imposed by nutrient limitations on growth. Changing ecosystem boundary conditions impacts the composition and thus biogeophysical and biogeochemical functionality of plant communities, and thus all these processes benefit from being considered within the context of dynamic and/or demographic vegetation.
Arguably, the inclusion of process representation in land surface models is accelerating, driven by the needs of various different user communities (hydrologists, biogeochemists, ecologists, atmospheric scientists, and crop modelers) and by arguments put forward that the overall biospheric feedbacks are themselves importantly affected by a great number of interacting systems, including, for example (at the time of writing), insect dynamics and impacts (Dietze & Matthes, 2014;Huang et al., 2019), vegetation sink limitations to growth (Fatichi et al., 2019), soil microbial dynamics (Wieder et al., 2013), subcanopy turbulence , leaf mesophyll processes , and polygonal tundra parameterizations (Pau et al., 2014).
At the same time, both land surface models and the atmospheric models to which they may be coupled are refining their spatial resolution, as enabled by new data sets and higher computational capabilities. A decade ago, Wood et al. (2011) argued that achieving such increases into the 10 2 -10 3 m resolution range was itself a grand challenge of land surface modeling, requiring increases in both the model capabilities and new data sets to drive and test such models. In response, Beven and Cloke (2012) argued that, while such increases in resolution should in principle allow for better simulations, the deeper problem lay with the epistemic uncertainty of how to represent any given process and how to capture the effects of smaller-scale unresolved processes, at any given scale. As the scope of land surface models has increased, and alongside computational advances that have largely allowed the hyperresolution goal to be attained (Bierkens et al., 2015), the questions of epistemic uncertainty and unresolved heterogeneity have grown in importance.
Rather than focus our discussion here on the arguments for and against inclusion of specific new processes in land surface models, or whether increasing spatial resolution by itself will qualitatively change the nature of LSM simulations, we instead focus on three broader challenges that integrate across model components, namely: 1. Managing and understanding the process complexity of LSMs 2. Heterogeneity and the dimensionality of the land surface 3. Projecting the temporal and spatial dynamics of model parameters Within each of these three "grand challenges" we describe the nature of the challenge, illustrate ongoing developments, and propose pathways within which research and model development might best be structured to meet the important but comprehensively difficult task of predicting the future of the terrestrial surface and biosphere.

Process Complexification
The wide variety of processes that interact to form the terrestrial system, and the depth of complexity present in every one of these processes, together create a deep obstacle to creating tractable models of the land surface. The increasing complexity of land models reflects both the tendency of scientists to focus on their own particular areas of interest and expertise, as well as the reality that the Earth is in fact complex and that the details of a great number of processes do in fact matter. But at the same time, the scope and complexity of some modern land surface models have reached the point that no individuals are able to comprehensively understand all facets of any one model. Further, a majority of model development teams (which are typically situated within and primarily funded by Earth system modeling centers) struggle to meet all of the demands placed on modern LSMs. has proceeded from simple "bucket" representation (Manabe, 1969), through 1-D Richards equation (Bonan, 1996;Cox et al., 1999), to 3-D variably saturated flow models that span from the soil through plant tissues (Bisht & Riley, 2019). The representation of biogeochemistry has proceeded from the small set of equations required to represent photosynthesis at the leaf scale (Dickinson et al., 1991), through full carbon cycle models (Dickinson et al., 1991), to multiple coupled nutrient models Thornton et al., 2007;Y.-P. Wang et al., 2015;Zaehle & Friend, 2010). The representation of plant community ecology has proceeded from the static plant functional types (Bonan, 1996;D. M. Lawrence et al., 2011;Zeng et al., 2002), through mean individual dynamic models with simple rules governing competition (Arora & Boer, 2006;Cox, 2001;Sitch et al., 2003), to models that resolve physiological processes at the canopy level and implicitly downscale to population demography using self-thinning or allometric scaling relations (Argles et al., 2019;Bellassen et al., 2010;Haverd et al., 2013), and to demographic or individual-based models with resolved competition between cohorts or individual plants Longo et al., 2019;Moorcroft et al., 2001;Sakschewski et al., 2015;Weng et al., 2017). The shift toward representing the agents of change has led groups to represent microbial types and their population dynamics in soil biogeochemical models as well (Treseder et al., 2012;Wieder et al., 2013). The role of both natural and anthropogenic disturbance, missing in early land surface models, has been a major focus of developments in order to represent the many direct effects that humans have on modifying the land surface (P. J. Lawrence et al., 2012;Nabel et al., 2019;Pongratz et al., 2018;Shevliakova et al., 2009;Yue et al., 2018). Many further dimensions of process complexification exist as well including canopy radiative transfer, trace gases, fire, permafrost, boundary layer turbulence, and rivers.
While the arguments behind all of these process developments are sound, the historical development pathways by which process complexification has proceeded in any given land surface model have been largely ad hoc and based on a collection of institutional, geographic, and individual preferences and interests. As a result, the representation of any given process across models is extremely heterogeneous: Some models may represent in great detail a given process that is entirely absent in peer models. This makes the comparison of model predictions and projections difficult and frequently uninformative (Clark et al., 2011), a feature which was noted in early model intercomparison efforts (Koster & Milly, 1997) and remains true today. Complexity also creates problems for those wanting to bring the evolving understanding of a given process into models: How do we weigh the costs and benefits of a given increase in complexity?
A frequently proposed strategy to dealing with the problems that arise through complexification is to pursue a "hierarchy of complexity" (Claussen et al., 2002) wherein parameters of simple(r) models are diagnosed from the aggregate behavior of complex models. Such approaches are enormously valuable, and show up across disciplines, but are generally themselves reflective of a particular perspective, because the specific "simple model" chosen is dependent on the question being asked and conditional on all the other processes deemed to be outside the hierarchy of complexity. To a hydrologist, the simple model may be a water balance model, while to a community ecologist the simple model may be the growth rate of trees as conditional on their size. How can we approach the complexity problem in a way that maintains sufficient flexibility to allow multiple different ways of simplifying things across the wide set of processes that comprise land surface modeling?

Modular Complexity as a Strategy
As land surface models themselves emerged from the introduction of interfaces between the land and the atmosphere in early climate models (Polcher et al., 1998), a possible solution to the complexification problem is to take a more modular approach to the representation of processes in the land surface, in order to allow the scaling of complexity and process representation across many dimensions ( Figure 2). The crucial requirements of such a modeling system are (1) the ability for it to represent a given process (or cluster of processes) in multiple ways, recognizing the epistemic uncertainty associated with any choice of representation as well as the possibility of very different degrees of complexity (from highly resolved process representations to highly simplified "stub" representations such as representing a given process as having a fixed value), and (2) to not necessarily assume which among a set of potential processes are the ones to be simplified or replaced, nor which aspects of a given process are the ones that a simpler configuration would be dependent on. For example, a simplified configuration focused on vegetation dynamics may want to ask for growth and mortality rates from a simplified representation of plant physiology, while a Processes, and sets of processes, are represented as boxes in the diagram, with information connections represented as arrows. All processes-though here shown only for stomatal conductance-are intended to allow alternate specifications, including possibly multiple hypothetical process realizations, empirical or machine learning-derived formulations, and/or simplified stub or null representations to allow for holding a given process constant while other processes vary. meteorology-focused simplified configuration may only require stomatal conductance and optical properties from the representation of vegetation (Laguë et al., 2019). Such efforts are already underway for subsets of the land surface modeling scope such as basin-scale (Clark, Nijssen, et al., 2015) or site-scale (Coon et al., 2016) water and energy budget models, leaf photosynthesis models (Walker et al., 2018), offline models of forest structure (Farrior et al., 2016) (Moore et al., 2018), and soil biogeochemical "testbed" models (Wieder et al., 2018) some of which are included as schematics in Figure 2b, but more effort is required to generate overarching frameworks that can encompass these various themes.
The difficulty of designing a model architecture with this ability in mind is that the boundary conditions for any one specific process (or cluster of processes) tend in practice to be very fluid. As representation of say, fire, tree mortality, or soil respiration evolve over time, new variables need to be passed from one part of the model to another for each iteration of the process representation (e.g., one fire model might need information on the status of a single pool of coarse woody debris, whereas its successor may need several size-structured pools). Any such coupling strategy must thus be specifically designed to accommodate a wide set of specific process representations and their variable boundary conditions at the outset, as well as flexibility in the numerical approach to creating the coupling. Thus, the design of interfaces that are robust to changing properties of submodules is a high priority. A further difficulty is in deciding how to aggregate processes into higher-level submodels: While it may be straightforward to define alternate hypotheses for, say, models of stomatal conductance or within-leaf carbon assimilation (Walker et al., 2018), other sets of processes may not be as unambiguously delineated.
In principle, such an approach to land surface modeling may be much more powerful than current approaches that use "ensembles of opportunity" to characterize structural uncertainty across a wide range of model predictions. The key weakness with contemporary model intercomparison projects such as C4MIP , TRENDY (Le Quéré et al., 2018), MSTMIP (Huntzinger et al., 2013;Schwalm et al., 2019;Zscheischler et al., 2014), ISIMIP (Nishina et al., 2015) and others is the inability to understand how process and parameter uncertainty maps onto uncertainty in the relevant model projections. Explanations that attempt to identify the largest variation in model projections in terms of specific processes such as nutrient or land use dynamics  are useful in suggesting what may be driving the models, but such approaches are currently limited by the poor control on structural and parametric variation between models. The more detailed assumption-centered approach of attributing divergences between models and experiments described by Medlyn et al. (2015) allows a better estimate of how structural differences lead to model divergences (see also De Kauwe et al., 2014;Walker et al., 2015;Zaehle et al., 2014); however, even in that framework the many model differences other than the specific assumptions being tested (e.g., as enumerated in Rogers et al. (2017)) add a degree of ambiguity to the interpretation. Schwalm et al. (2019) attempt a post hoc linkage between various components of LSM structure within the MsTMIP archive with model skill scores but still emphasize that their analysis undersamples the potential range of model configurations. Building intercomparison efforts around model frameworks that use a modular complexity approach, as has been explored in specific models around specific aspects of process representation, such as the stomatal conductance example shown in Figure 2 (Franks et al., 2018;Knauer et al., 2015), but expanded and systematized such that each model system could explore all aspects of the structural uncertainty questions investigated with a breadth comparable or greater than current MIPs, would provide a much firmer basis for attributing and understanding differences in model behaviors. Such an approach would allow the community to move away from its current dependence on ensembles of opportunity and toward one built upon ensembles of purpose.
One further potential benefit of such an approach is that model components could be developed collaboratively. Given that the majority of models in the CMIP6 intercomparison do not at present represent the key processes relevant to biogeochemical feedbacks (nutrient cycling, fire, and dynamic vegetation) (Arora et al., 2019), we argue that the current system, with its intrinsic massive duplication of effort, could be improved if certain components were shared across models, with international teams of the relevant process-domain experts contributing to the representation of individual modules. Modern online collaboration and communication tools should make such "horizontal" division of effort more tenable for a new generation of land surface modelers. The CICE consortium, an international team of sea ice model developers (Roberts et al., 2018), provides an excellent example of this modus operandi.
A notable barrier to developing a culture where model frameworks are deployed by default using parametric and/or structural ensembles is the "one-model-one-vote" format of the CMIP exercise (Eyring et al., 2016), and other MIPs, wherein it is expected that single releases of each Earth system model (ESM) provide either one integration, or an ensemble of integrations that cover the chaos-induced natural variability in the climate system by slightly modifying initial conditions (Kay et al., 2015). Atmospheric and ocean processes, in particular, are known to be highly dependent on initial condition uncertainty, but this focus is somewhat misplaced in the context of land surface models, where parametric and structural variation is apparently dominant over initial conditions at all timescales longer than a few years (Bonan & Doney, 2018). Shifting some substantial fraction of computational resources away from initial-condition-focused approaches, and toward structural and parametric uncertainty approaches, is thus also required to better represent the total uncertainty inherent in land surface projections.
A further advantage of such a modular complexity framework may be to embed purely empirical approaches, such as from machine learning, within a given model process. Such approaches may be useful in solving two distinct sets of problems. The first is that, because of the large scope of land surface modeling, several aspects of the models have little theoretical basis and are currently based on empirical or ad hoc approaches. Some of these processes, such as phenology (Dahlin et al., 2015(Dahlin et al., , 2017Taylor & White, 2019) and hydrology (Lapeyre et al., 2019) are the subject of a large number of observations, and thus may be amenable to machine learning approaches. The second set of problems are ones where we may have a detailed process-level understanding, but where solving these equations are computationally too expensive for a given application. In this case, surrogate or reduced order models, based on machine learning approaches that have been trained on the full process representation models, may allow for higher fidelity solutions than current, purely process-driven approach used across LSMs. Given the increasing emphasis on machine learning approaches and the successes of machine learning in solving problems in ESM (Gentine et al., 2018) or offline hydrologic model (Fang et al., 2017;Shen, 2018) behavior, designing models with an emphasis on modular complexity to allow for such hybrid approaches is a crucial challenge in modeling the land surface.

Horizontal Heterogeneity
The boundary conditions of the land surface change as a function of the climate, which is typically provided to LSMs as gridded products, either from Earth system models or climate "reanalysis" data products (Sheffield et al., 2006). Even, however, at the highest resolutions foreseen using modern climate models (1-10 km), land surface processes can be notably variable (Fox et al., 2008;Lundquist & Dettinger, 2005;Tai et al., 2017) within a single "climatic" grid cell. Simulating areas with disparate functionality as a single homogenous entity can lead to numerous errors in prognosis, particularly on account of strong nonlinearities that are common features of land surface processes (Sellers et al., 2007). One approach to resolving subgrid heterogeneity is to further increase the resolution of the model. This approach was advocated from a hydrological perspective by Wood et al. (2011) who described the implementation of "hyperresolution" models operating down to a scale of~100 m as a "Grand Challenge" in hydrology. While the resolution at which land surface models can be run continues to increase, the majority of LSMs can be run on spatial grids of arbitrary resolution, and their typical deployment remains at much larger spatial scales (0.5-2°) in the context of simulating global climate feedbacks and impacts. However, such resolutions only solve the heterogeneity problem where the length scale of the dimension of variation is of the same order of magnitude as the grid cell size. In practice, many elements of landscape heterogeneity, including forest gaps and microtopography manifest at smaller scales (Aas et al., 2019;Bonan et al., 1993;Thomas et al., 2008). In response to Wood et al. (2011), Beven and Cloke (2012) noted this point, as well as the concern that "hyperresolution" does not address the numerous epistemic uncertainties remaining.
To allow for operation across a multitude of scales, most modern land surface models, for example, SURFEX (Masson et al., 2013), JULES (Burton et al., 2019); CLM5 ; CABLE (Haverd et al., 2018), ORCHIDEE, (Naudts et al., 2014), and JSBACH (Mauritsen et al., 2019), operate using some sort of subgrid "tiling" system, which disaggregates pools and fluxes of relevance (water, energy, carbon, and nutrients) along predetermined axes of variation that capture various properties of the physical surface. In modern land surface models, the elements of heterogeneity most often captured include lakes, rocks, ice and urban areas, as well as "soil." Typically, the soil area is divided into plant functional type (PFT)-based tiles, potentially also including crop types, as well as bare soil. Tiles are typically defined as spatially implicit aggregations of all of the area within a grid cell belonging to a particular land surface category.
As process complexity grows, however, the need to represent fine-scale gradients in land surface heterogeneity grows with it. An ongoing theme of land surface model development is the proposal of additional axes of variation which might be considered necessary to accurately represent particular land surface processes.
For example, Aas et al. (2017) illustrate the importance of representing snow-covered and snow-free parts of an alpine landscape on runoff characteristics, illustrating that the melting of an area-averaged snowpack can be delayed by 2-3 months compared to a disaggregated and variable-depth snowpack. Sellers et al. (2007) argue that, on account of the nonlinearity between soil moisture, plant water stress and evapotranspiration, that landscapes might be binned according to soil wetness, and the bulk evaporative stress functions calculated as an area average across bins. They, and latterly Baker et al. (2017) show that area averaging of soil moisture (to reflect the patchiness of time since the last rainfall event) substantially reduces model responsiveness of evapotranspiration to rainfall events. Fan et al. (2019) and the hydrology community more generally (Clark, Fan, et al., 2015), have argued that landscapes might be tiled according to "hydrological response units" (HRU) which attempt to capture the dynamics of lateral water drainage, and thus the nonlinear impacts on hydrological and ecological processes downstream from the simulated topographic convergence of moisture. Such schemes define HRUs in terms of fractions of a grid cell, and thus can represent bulk properties of hydrological variation without increasing computational costs by orders of magnitude. Subin et al. (2014) illustrate the impacts of subgrid representation of hillslope hydrology in the GFDL model, noting, in particular, an increase in soil carbon resulting from saturated lowland areas. Swenson et al. (2019) report the implementation of an HRU approach into the Community Land Model v5, illustrating that the strongest impact of hydrological tiling occured in semiarid areas. HRU tiling efforts are underway in other LSMs, for example, JULES (https://www.ceh.ac.uk/ hydrojules). Fan et al. (2019) further note that as well as lateral drainage from hills to valley, slope aspect (the difference between sunny and shady slopes) is another first-order control on water and energy availability across the landscape.
A largely orthogonal set of developments pertains to representing the basic principles of community ecology, wherein the primary axis of variation in productive natural ecosystems is the patch mosaic generated by secondary succession: the processes of ongoing vegetation mortality and disturbance, gap formation, and the recovery of vegetation back to a closed-canopy state. Once again, many processes, in particular recruitment of young plants, are nonlinear with respect to the surface light environment (which itself is also highly nonlinear with respect to canopy shading). An absence of gaps in "big leaf" ecosystem models leads to an inability of trees to regenerate where the grid cell average forest has a closed canopy, leading to substantial low-biomass biases in models where forest demography is not resolved (Moorcroft et al., 2001). Similarly, where natural systems are subject to ongoing disturbance from natural mortality, in any given grid cell, anthropogenic disturbance also gives rise to a range of ages of secondary forest where biomass recovery also proceeds in a nonlinear fashion after abandonment. Shevliakova et al. (2009) andNabel et al. (2019) illustrate the importance of capturing the regrowth after disturbance in anthropogenically disturbed ecosystems for simulating the terrestrial carbon sink. At larger scales of soil heterogeneity, some models also implement tiling regimes for soil type. The ED2 land surface model (Longo et al., 2019), for example, divides the surface into components of different soil texture (sand & clay fractions). Later versions of the same model also implement tiling for soil nutrients but specifically allied to variation in disturbance history (Trugman et al., 2016).
This makes at least seven (snow depth, hydrological regime, aspect, rainfall distribution, soil texture, soil fertility, and time since disturbance plus time since land abandonment) relatively strong arguments for additional dimensionalities of subgrid-scale horizontal heterogeneity within land surface models. In addition to which, land surface processes outside of the "natural vegetation" type have been disaggregated within specific land use classes, into new crop types (including greater varieties of plant, plus the degree to which those crops are fertilized or irrigated; D. M. Lawrence et al., 2019) and subcategories of urban environments (roofs, sunlight and shaded walls, and impervious and pervious ground (Oleson et al., 2008). The majority of such new dimensions of tiling are typically proposed in isolation, but clearly, when considered collectively, it is not computationally tractable to divide up each climatic grid cell along all proposed dimensions. What is missing is an overall strategy via which one might discern the most important axes of variation for a given time and place. Capturing multiple simultaneous elements of landscape heterogeneity (Figure 3) must surely be a feature of such a strategy.
In principle, one might reduce the dimensionality of the land surface by means of representing covariance between different elements of heterogeneity (e.g., between hydrological regime, snow cover, and soil type). Newman et al. (2014), for example, present a k means clustering approach to defining a predetermined tiling scheme for a specific location, generating a set of 10 tiles that capture the dominant multifactoral regimes affecting land surface dynamics within a given tile. Identification of functionally similar units is an intuitively appealing approach to reducing the dimensionality of the multifactoral tiling regime, but of course rests on the nature of the questions one will ask of the model, for example, whether those are weighted toward hydrological, biogeochemical or ecological questions. Further, a priori determination of physical covariances assumes that the important axes of tiling are fixed in time and space.

Adaptive Tiling Strategies
While some axes of land surface variation (aspect, altitude, etc.) are indeed fixed on the timescales (tens to hundreds of years) under consideration, many of the given reasons for subgrid tiling are by definition dynamic in time and space. Thus, the degree to which tiling is needed along a particular axis varies. By way of illustration, within the Ecosystem Demography model (Moorcroft et al., 2001) the degree of discretization of the landscape along the disturbance-recovery axis is responsive to the current need for the model to distinguish ecosystems of varying size-structure. New tiles (or patches, in ED terminology) are formed when a disturbance event occurs. Subsequently, patches with ecosystem structure that are considered "sufficiently similar" (a user-defined characteristic) are fused and become a single model unit, with the physical and biological characteristics recalculated in the process. In practice, this means that large parts of the world with low productivity are not tiled for disturbance at all, saving significant computational time in the processes.
It is possible to imagine that areas impacted in a transient fashion by snow, rainfall, large gradients in soil moisture, and so forth, might be amenable to an "adaptive tiling" approach. Difficulties with generalizing this concept exist, pertaining in particular to the nontrivial complexity of merging and splitting highly complex model objects, with possibly different timescales of persistence in land surface heterogeneity. Lawrence et al. (2019) document the introduction of "dynamic" land unit transitions, which also allow fusion and lumping of, for example, physical and biogeochemical soil states. Limited to a smaller number of specifically transient dimensions, an extension of the ED approach to adaptive temporally variant tile resolution across multiple dimensions of heterogeneity (e.g., snow, surface moisture) appears at least theoretically plausible (Figure 4). This approach could allow the needs of multiple modeling communities to be met simultaneously, without expanding computational cost excessively. Such a scheme could operate within the context of subgrid tiling based on temporally invariant (e.g., topographic) landscape features. Numerous modeling groups are implementing both hydrological response unit tiling (Hazenberg et al., 2015;Subin et al., 2014;Swenson et al., 2019) and also vegetation demographics with disturbance tiling Longo et al., 2019;Weng et al., 2017). Therefore, the most likely near-term pathway for the representation of subgrid horizontal heterogeneity is the nesting of vegetation demographics (VDM) models (with time-varying adaptive tiling schemes) inside hydrological response unit tiles (which typically are determined from topography, and thus fixed in time). This methodology should allow for the prediction of, say, the responses of vegetation to variation in water stress across landscapes, the variation in drought mortality risk with differential access to water tables, and more generally allow a closer linkage between hydrological environments and vegetation quantities, which should in principle lead to more accurate responses to future change. It is possible to envisage further refinement of these architectures, both to expand the adaptive elements within each HRU, as well as refinement of how the fixed tiles represent covariance structures of other time-invariant structures such as altitude, aspect and soil fertility. The simultaneous operation of HRU and VDM schemes represents a substantial increase in the complexity and cost of the representation of the land surface, and thus it is imperative that they are implemented in ways that are flexible enough that they can either be turned off, and/or that the degree of disaggregation can be modified in accordance with the nature of the research question. This capacity, to probe the response of the model to alternative degrees of tiling in ecological and hydrological domains is a highly novel tool that should both provide more tangible answers to outstanding questions of tiling strategy, and provide a forum for greater collaboration across ecological and hydrological domains (e.g., Tague and Dugger (2010)).

Patch Length Scale and Adjacency
Discretization of the land surface along any particular vector leads to (and indeed, is motivated by) a separation of state variables into categories which evolve according to variable input and output fluxes. In reality, however, some diffusion of various quantities (energy, water, nutrients) between tiles existing in different states is likely, reducing the heterogeneity of the system. The rate of diffusion is dependent on the length scale (or adjacency) of different elements within the heterogeneous real-world landscape matrix. Given, however, that the tiling systems in LSMs are nearly always spatially implicit, and that each "tile" Conditions under which forest structure, snow water equivalent, and surface moisture (respectively) are sufficiently heterogeneous to merit separation into independent tile units. (d, e, and f) Conditions where heterogeneity in these features is low and would not require resolution. Panels (a) and (d) represent the mechanisms already present in ecosystem demography-type models, whereby new tiles are generated for each disturbance event and are then fused back into preexisting tiles if biomass structure is not sufficiently different to merit resolution. Thus, the model adapts to the complexity of the landscape and does not generate tiles where vegetation stature is low. "Event-based" tile generation and fusion could thus also form the basis of representing time-varying hydrological dynamics with new tile generated during snow and rainfall events, becoming homogenized with melting and/or drying. Other aspects of tiling that are not dynamic on the timescales in question (topography and aspect) would still require resolution at a higher level.
represents an aggregation of a set of different elements of a complex spatial mosaic, the rate of diffusion of quantities between tiles is difficult to ascertain. Typically, diffusion is either assumed to lead to complete homogeneity, where no tiling exists, or impossible, where it does. Models that capture a degree of diffusion between tiles would in principle need to be informed of the relevant length scale of their subtile elements (Jupp & Twiss, 2006). As above, the length scale of time-invariant features of the landscape might be distinguishable from remote sensing (topography, land use history, river and road fragmentation, likely requiring machine learning methods to identify such features globally), while other time-varying features might better be retained from the inferred or simulated size distribution of disturbance events. For example, Koster et al. (2019) illustrate the utility of soil moisture retrievals for informing the length scale of surface moisture following rainfall events, illustrating its dependence on the type of rainfall (convective vs. large-scale precipitation).
Relatedly, some phenomena (fire, insects) intrinsically "spread" through the landscape via contagion, a process which is difficult to model explicitly at the level of LSM grid cells. McCabe and Dietze (2019) propose a method for estimating the size distribution of contagious disturbance events based on their disturbance, initiation and spread probabilities as well as retaining through the simulation a metric of the "adjacency" of tiled elements within grid cells. Their method evolves the spatial adjacency of disturbed patches through time, and therefore could be generally applicable to the problem of retaining length-scale information for time-varying quantities. An estimate of the initial adjacency (presumably including time-invariant elements of landscape patchiness) is required, again, from analysis of remote sensing data. The definition of a "patch" for the purposes of calculating adjacency is, however, dependent on the target processes of interest.
McCabe and Dietze (2019) further argue that the inclusion of the concept of adjacency (and its dynamics) would in principle allow for a myriad of additional ecological phenomena to be captured, including edge effects on forest microclimate (of particular importance for the spread of fires), the dependence of dispersal limitation on spatial arrangement of forests, simulation of invasive species dynamics, and also as above the flow of matter and energy between patches. Thus, the extraction and use of both tiling units and their bulk spatial relationships might also be elements of the "grand challenge" of representing the heterogeneity of the land surface and the living systems that exist within and upon it. Clark, Nijssen, et al. (2015) note that vertical stratification is much more refined in models that focus more on vegetation physiology than in models that focus on the hydrological cycle. While early land surface models were built around a one-dimensional representation of the terrestrial surface to correspond to a single grid cell of an atmospheric model, they soon expanded to resolve vertical gradients in soil moisture and temperature to better capture surface energy fluxes and the representation of plant water access. Resolving vertical gradients in soil biogeochemistry, for example, is essential for systems such as permafrost-affected soils, where steep gradients in the soil physical climate mean that carbon cycles very differently at depth than at the surface (Koven et al., 2015). Slater et al. (2017) also note the improved performance of models with vertical profiles of temperature within the snowpack (Chadburn et al., 2015;van Kampenhout et al., 2017).

Other Dimensions of Heterogeneity
Vertical gradients in light, water status, temperature, leaf properties, and atmospheric conditions within the vegetative canopy are typically not resolved in most mainstream land surface models. The "two-stream" approximation of Sellers (1985), provides a closed-form solution for the scattering of direct and diffuse light through homogenous vegetative canopies, thus collapsing the vertical structure down to one or two (e.g., sunlit vs. shaded leaves) states. On account of its computational parsimony, this approach was widely adopted as standard in LSMs, precluding vertical representation of other quantities. In recent years, however, the gradual inclusion of increasing vertical detail has been an ongoing feature of land surface model development, enabling more robust comparisons with field data, which by definition are made on particular canopy layers. Implementations of the vertical structure of light absorption by leaves (Fisher et al., 2010;Mercado et al., 2007), for example, provides the capacity to further vary plant physiological traits with canopy depth, allowing models to represent the observation that plant traits do not in fact appear to scale consistently with light availability as assumed by the two-stream model (Lloyd et al., 2010;Meir et al., 2002). The further introduction of vertical variation in leaf water potential, within the context of plant hydrodynamic models, gives rise to the possibility of testing plant water status against field observations of leaf water potential, stem water potential and sap flow, which differ substantially with canopy height and irradiance (Christoffersen et al., 2016;Fisher et al., 2006;Joetzjer et al., 2018;Matheny et al., 2017;Mirfenderesgi et al., 2016;. A full treatment of the vertical structure of vegetative canopies, however, requires resolution of how light, temperature, CO 2 and water content/humidity vary throughout the leaf layers and canopy space. In forest ecosystems, in particular, large within-canopy gradients generate substantial environmental heterogeneity, which, as well as modulating gross canopy fluxes, is potentially an important driver of niche separation and the capacity to represent functional diversity. A few land surface models have recently implemented vertical gradients of irradiance, water content, leaf temperature, and also the feedback between the evaporation of water into canopy air space and the humidity of the airspace, modulating by turbulence processes within the canopy and the roughness sublayer (which extends to roughly twice the height of the canopy) Chen et al., 2016;Longo et al., 2019). These efforts represent the cutting edge of physical representation of forest-atmospheric exchange, and further challenge traditional assumptions about the distinction between the atmospheric and the planetary surface, boundary layer, as they bring the calculation of atmospheric mixing processes well into the realm traditionally occupied by LSMs, once again raising issues related to the management of model complexity discussed earlier.
A significant intersection between the resolution of heterogeneity and the prior challenge of complexity management is that agent-based models typically require a representation of the relevant gradients of heterogeneity that are appropriate to the scale of the agents being modeled. This may apply equally to microbially explicit soil biogeochemical models as to the cohort-based vegetation models discussed above. In principle, complex rhizosphere gradients in nutrient density radiating away from root surfaces are required for appropriate simulation of microbial communities Wieder et al., 2015), and nonlinear dynamics of soil-root resistance, a principal bottleneck on transpiration (Fisher et al., 2007;Sperry et al., 1998;Williams et al., 2001). Similarly, soil physical heterogeneity, from mineralogical gradients at micron scales to soil structural gradients at centimeter scales, may be crucial for governing both biogeochemical gradients that govern soil microbial ecology, as well as macropore flows that determine large-scale hydrologic functioning. As model process representation shifts toward representing the agents responsible for ecosystem function, rather than the aggregate behavior of ecosystem function, the need to match scales of process with resolved heterogeneity represents one of the more complex edges of model structural variation.
Reflecting our arguments in the previous section, coherent strategies to define the boundaries between interacting complex systems will be necessary to allow informed and useful deployment of models with this level of complexity in tandem with increasingly refined depictions of the horizontal domains included in LSM. As the dimensionality of LSMs increases, it will be imperative to build models that are sufficiently flexible that we can assess how resolving various gradients matters in the full system.

Challenge: Projecting the Temporal and Spatial Dynamics of Model Parameters
Land surface models tend to have a large number of parameters. Hourdin et al. (2017) argue that atmospheric models, are in general "founded on well understood physics combined with a number of heuristic process representations." LSMs, in contrast, combine numerous physical processes (themselves often dependent on the complex heterogeneity of the surface, as previously described) with large numbers of biological processes that in principle operate at a molecular level and are thus not practical to represent at their native scale. These processes are encapsulated as parameters, often formulated at the scale of relevant observations (e.g., individual leaves or trees). These parameters contribute to model uncertainty in a few different ways, which we describe below, as a set of distinct problems of parametric uncertainty which we refer to here as "parametric dynamics." Ideally, process-based models should use input parameter values that represent properties of the system that are static in time and space (Hourdin et al., 2017). For a plant trait or ecosystem property that is observed to vary in time and space, choosing whether the model parameters should represent either the mean value of that trait, parameters of observed relationship of the trait with the environment, parameters of a model that optimizes that trait with respect to the environment, or a whole range of parameters representing alternative types of plant that can be selected for or against according to the environment, is an open question. Thus, idealized discussion of what aspects of ecosystems can and cannot considered "parameters" remain largely out of scope for LSMs.

Parametric Uncertainty and Fitting
The first problem of parametric uncertainty is the simplest: How do we choose a set of parameters that gives a high agreement between model predictions and a wide suite of data sets? While simple to state, the large number of degrees of freedom makes this problem difficult to solve in practice. Numerous efforts have been made to optimize parameters in land surface models, using a variety of Bayesian approaches with priors coming from plant trait or other data (LeBauer et al., 2013), and based on optimizing to fit many different data sets, including optimizing hydrologic models against stream data (K. Beven & Binley, 1992), fitting gas exchange parameters to eddy covariance data (Mäkelä et al., 2019;Post et al., 2017) or using emulators (Fer et al., 2018;Sargsyan et al., 2014) or adjoints to full land surface models (Verbeeck et al., 2011) to optimize against eddy covariance observations. However, because of the high dimensionality of parameters, such efforts typically run into the barrier of equifinality: Running a model with many different sets of parameters can lead to equally good fit to data, and these equally good models may lead to widely divergent results under novel conditions (Tang & Zhuang, 2008).
One possible solution to this is to optimize models parameters against multiple types of data simultaneously, to allow separation by processes acting on different timescales or on different aspects of model predictions, as was done by MacBean et al. (2016). Extending such approaches to cover the large set of processes and parameters relevant to LSM predictions is itself an enormous challenge. Further, the direct assimilation of data for calibration (Kaminski et al., 2013) also leads to philosophical questions related to the interpretation of benchmarking and performance metrics (Collier et al., 2018;, which the LSM community is yet to confront systematically. A primary issue with LSMs is that biases in one part of a complex coupled model can undermine effective calibration of other components. In principle, embracing a more comprehensive modularity framework (section 2 and Figure 2) might allow for some individual processes to be calibrated in isolation with boundary conditions prescribed from observations or data products (Kemp et al., 2014). Many existing calibration studies have implicitly used low-complexity versions of carbon cycle models, for example, Bloom et al. (2016). Extension of this concept might facilitate the necessary dimensional reductions required to make this problem more tractable.

The Challenge of Living Systems: Predicting Changes in Ecosystem Properties
Beyond the problem of parameter optimization lies a deeper challenge: Many of the key canopy-scale properties of the land surface are determined by the traits of plants or other organisms (Kattge et al., 2020), which may vary enormously in their functional diversity across otherwise relatively homogeneous patches of ground. Because plants are constantly growing, dying, reproducing, and competing for resources, these compositional mosaics are also dynamic in time. Thus, under the large changes to the environment currently underway, we expect complex responses in the plant community composition at any given location thatin addition to constituting a major class of ecological impacts to be understood in their own right-determine the distribution of plant traits (as defined at a canopy scale) and the dynamics of the land surface.
Thus, we must decide upon methods to predict how plant function at the community level is likely to shift in response to global change. Approaches to this problem can be roughly grouped into three types: correlative, optimizing, and competitive ( Figure 5). Correlative approaches take empirically observed relationships between environmental variables and trait values, and assert that such relationships are conserved under global change. Many different variants on this argument exist. Early dynamic vegetation models (Sitch et al., 2003;Woodward & Lomas, 2004) used a discrete PFT version of this logic, where PFT distributions are constrained by bioclimatic indices, and each PFT defined a set of traits in a land surface model; thus when climate changed, the PFT coverage changed with it, which in turn changed the parameters representing plant processes of the model at a given grid cell. More modern versions of this approach isolate specific traits that are clearly observed to vary as a function of environment within the lifetime of an individual plant, and allow these to vary in time and space and a function of environmental conditions. Examples include the thermal acclimation of leaf photosynthetic and respiratory temperature sensitivities (Atkin et al., 2015;Kumarathunge et al., 2019;Lombardozzi et al., 2015;Slot et al., 2014), models that define allocation patterns (Thornton et al., 2007), N fixation (Thornton et al., 2007), and stem mortality rates (Delbart et al., 2010) all as functions of NPP, or more general relationships between plant traits and climate as inferred across multiple traits (Butler et al., 2017;van Bodegom et al., 2014;Verheijen et al., 2015).
Optimizing approaches work by, in principle, constraining predictions of plant trait values with the principle that evolution and competitive dynamics should have selected trait values that confer the highest "fitness" in a given environment. Thus, one can hypothesize that these optimal values are those most likely to be present. The crucial requirement for such approaches is to be able to define a functional relationship of costs and benefits (or fitness criteria) for a given trait value as conditional on the environment, which can then be optimized. Like correlative approaches, optimizing approaches make an assumption of rapid adjustment to environmental variation, and thus may be only strictly valid for traits that can be shown to vary over the lifetime of an individual plant. Examples of the expanding literature on plant optimality theory include the prediction of leaf nitrogen allocation to colimitation metabolic processes under varying environmental conditions (Smith et al., 2019;C. Xu et al., 2012), the response of canopy nitrogen to CO 2 fertilization (Franklin, 2007;Franklin et al., 2009), control of stomata to maximize assimilation while avoiding dessication (Bonan et al., 2014;Eller et al., 2018;Kennedy et al., 2019;Medlyn et al., 2011;Williams et al., 1996;Wolf et al., 2016;. Wang et al. (2017) predict internal leaf CO 2 balance based on a model that optimizes assimilation while accounting for the costs of water transport and nutrient uptake, and Street et al. (2012) show that N profiles in arctic canopies are consistent with optimal allocation theory controlled by diffuse light profiles. All optimality approaches rest on the determination of a proxy of fitness that should be maximized (which is uncertain, per Caldararu et al. (2019)), the definition of a timescale over which the optimization is relevant, and an assumption concerning the physiological limits of optimization and the timescales within which it can be achieved. Optimization is, for different purposes assumed to occur at scales from the lifetime of a single plant, to the timescale of adaptation of a whole ecosystem to its prevailing climate. The capacity for whole ecosystems to rapidly change functionality under a changing climate (particularly those dominated by long lived trees) may be slowed by the rate of demographic change and migration of better adapted individuals into the system, and therefore optimality approaches should perhaps be viewed as the likely equilibrium state of a system (with the caveat that the optimal strategy for individuals does not necessarily represent the evolutionary stable strategy within a competitive framework (Dybzinski et al., 2011).
Competitive approaches more directly address the need to simulate the demographics of transient ecosystem states. Instead of optimizing a specific function, competitive approaches attempt to resolve the population dynamics of individual agents, competing for resources in the context of the environment. Thus, the population dynamics themselves act to find optimal values among a set of possible trait values from the pool of competing types. The dynamics of competition may range from Lotka-Volterra-type formulations with the competing agents being canopies comprised of plants from a given PFT (Cox, 2001;Harper et al., 2016) to demographic or individual based models where the agents are either cohorts of size-resolved plants or individual plants competing for space in the canopy and other resources (Christoffersen et al., 2016;Moorcroft et al., 2001;Purves et al., 2008;Sakschewski et al., 2015 ;Scheiter et al., 2013). Advantages of this approach are that it can in principle capture the timescales of community adjustment to global change, as well as that it does not require an a priori estimation of a fitness function to be optimized, and thus may be applicable to a wider range of traits, or interactions between traits, than optimizing or correlative approaches. A significant challenge of this methodology is the maintenance of functional diversity, particularly in models that specifically are not inclusive of many mechanisms known to stabilize competitive exclusion processes between plants with differential fitness (Chesson, 2000;Gravel et al., 2011), as well as spatial dispersal processes, high dimensional resource partitioning and the density-dependent impact of "natural enemies" on demographic stability. Investigation of the maintenance of functional diversity in demographic vegetation models is thus an emerging field of research in this domain (Falster et al., 2017;Koven et al., 2019;Maréchaux & Chave, 2017;Powell et al., 2018).
A further challenge in competitive approaches is to better understand how the definition of PFTs in a given model relates to model predictions. As model complexity has grown, such definitions have grown more complex, from early approaches that equated PFTs with biomes to newer approaches that define PFTs via multiple axes of trait variation. Key challenges relate to the specific ways in which continuous and multidimensional trait variation is discretized into the axes of trait coordination that define PFTs, including (1) the number of axes needed to distinguish a comprehensive set of PFTs needed to solve a given problem, (2) how these axes are specified from trait observations while taking into account both represented and unrepresented tradeoffs that may prevent dominance by any one PFT (Sakschewski et al., 2015;Scheiter et al., 2013), and (3) how finely should a set of possible PFTs resolve any given axis of trait variation?
We outline three alternative, but not necessarily competing, philosophies for addressing the dynamics of organism traits-and thus ecosystem properties-in time and space. In principle, all of these approaches (correlative, optimizing, and competitive) may be combined in a given land surface model, but theory for how to do so is not well developed. For any given trait, the inclusion of a high degree of plasticity through either correlative or optimizing approaches would reduce the role that such a trait plays in determining competitive outcomes. In principle, to best reflect reality, observed within-lifetime plastic responses to climate could in principle be nested within competitive demographic approaches for projecting distributions of traits where no such individual-level plasticity is evident.
To capture trait dynamics on timescales of many generations (or to take the optimisation of plant fitness in the presence of competition properly into account) would require demographic models to be embedded within representations of trait evolution (including mutation and selection), per (Falster et al., 2017;Scheiter et al., 2013). This consideration, combined with the need to enhance coexistence of functional types within competitive models, suggests a specific need to open a greater dialog with other formerly separate disciplines in ecology. The field of biodiversity and ecosystem function is also motivated, for example, by improving understanding of the means by which functional variations within extant communities of species may or may not confer resistance and/or resilience to climate shifts (Hooper et al., 2005;Isbell et al., 2015;Turnbull et al., 2013;Yachi & Loreau, 1999). Interactions between coexistence theorists and land surface modelers are rare, a situation which we hope improves as our ecological tools at the intersection of these fields mature.

Further Challenges in Land Surface Model Science
In this discussion of "grand challenges" we have focused on several higher-order elements of LSM development: complexity management, surface heterogeneity, and parametric dynamics. There remain numerous other aspects of LSM science where substantial progress is necessary, but for which the overall solutions are perhaps more apparent within contemporary organizational structures. For example, development of the scientific collaboration and software infrastructure to conduct comprehensive and rapid model benchmarking will continue to be a major priority of the community, with particular reference to the adoption of community tools to avoid duplication of effort (Abramowitz et al., 2008;Best et al., 2015;Collier et al., 2018;Nearing et al., 2018) (and as illustrated at, e.g., www.ilamb.org), and these efforts will need to be extended to encompass data products and outputs relevant to the emerging capacities of LSMs (hillslopes, vegetation demographics, etc.).
Relatedly, a major focus is required to generate and apply data products that can be used within land surface model development, from the ever-expanding scope of Earth observation remote sensing activities and other large data sets and data synthesis activities. This is an especially wide and active field in which clearly a great quantity of effort is already expended (Duncanson et al., 2019;Houborg et al., 2015;Quegan et al., 2019;Stavros et al., 2017) and the depth and breadth of new satellite observations of, for example, biomass and canopy structure, carbon dioxide, multispectral and hyperspectral surface reflectance, chlorophyll fluorescence, emissivity, and water content will likely revolutionize our knowledge of the terrestrial biosphere and our capacity for predictive understanding. The development of data products (including those scaled from site level observations to gridded products, e.g., Beer et al., 2010) should bear in mind, in particular, the likely future trajectory of land surface model developments in future years. Increasing process resolution of LSMs (along the dimensions discussed above) should allow significant improvement in the capacity for Earth observations of the real world to be directly comparable with model states and fluxes, and both activities should be designed to leverage this potential, in particular, by prioritizing the availability of data products more closely related to the raw signals than to products aggregated by default to the same degree as older land surface schemes.
As LSMs have matured to provide more detailed representations of the land surface, another key development has been to follow the lead of atmospheric models by providing short-term forecast cycles for aspects such as hydrologic prediction (NOAA, 2016). Relatedly, in the context of short-term forecasting, numerous "land data assimilation systems" (LDAS) have been implemented in the last two decades, as reviewed by (Xia et al., 2019). The focus of these efforts is typically on improving the system state for the purposes of better short-to medium-term predictability. Such efforts are useful for identifying where LSMs do and do not have predictive skill, but with some exceptions (Fox et al., 2018;Kaminski et al., 2013;Peylin et al., 2016) efforts are not yet particularly well integrated into climate-focused land surface modeling activities. To some extent, short-term weather forecasting operations are concerned only with a subset of the problems faced by climate-oriented modeling activities. Integration of observed leaf area index, snow cover and surface soil moisture, for example, overrides many of the higher-order predictive processes in a complex land surface scheme. The emergence of the concept of ecological forecasting  however, aims to probe and illustrate the degree to which the concepts of data assimilation can help constrain predictions dependent on accurate representation of ecosystem processes (Fer et al., 2020;Niu et al., 2014).
More practical areas of concern relate to the availability of sufficient computing resources, and to the technical challenges of implementing modularized code structures and adaptive tiling schemes. Meeting these challenges-via access to supercomputer infrastructure, and critically via the entrainment of modern professional software engineering-is intrinsically linked to the need to strengthen funding infrastructures for land surface modeling activities. LSMs have most typically developed associated with and adjunct to atmospheric modeling activities. Simultaneously, the scope of LSMs has expanded such that their applications rest firmly within the domains of hydrology, ecology, geography and biogeochemistry. This situation is challenging for the majority of national-or agency-level funding networks, particularly those where funding streams are aligned with more traditional academic disciplines. Strengthening the connection between the LSM community and the disciplines on which the evolving model capacity both encroaches and depends is the most likely means by which changing the status quo can be achieved.

Conclusions
Global concern is more deeply focusing on the fate of the terrestrial biosphere and the land surface, as we accelerate toward rapid changes in climate, atmospheric composition, and land use. With this increased focus, studies using land surface models regularly make international headlines. LSMs are the primary tools that we have to simulate conditions for life on the terrestrial surface of planet Earth and play a crucial role in our ability to estimate the quantity of carbon that humanity can, in principle, emit to limit climate change to any given international target (e.g., 1.5 and 2°C).
Despite this extraordinary degree of interest, the number of individual scientists and software engineers actively developing LSMs could comfortably all be housed in one medium-sized village, and even the most active LSM teams struggle to meet multifaceted demands placed upon them. These include predictions of the terrestrial cycling of carbon, water, energy, nitrogen, methane, and N 2 O, in the context of changing climate, atmospheric CO 2 , ozone, and N deposition, as well as vegetation cover, land use, fire, and crop management,.
We argue that new paradigms in complexity management, in the flexible representation of surface heterogeneity, and in the representation of parametric and trait dynamics, are needed to meet the overwhelming challenges that are necessarily imposed upon our community by the questions of society. To modify existing development practices to encompass the modular complexity and adaptive heterogeneity frameworks, we suggest is a major scientific and software engineering challenge. We emphasize that modern collective, open approaches to code development, benchmarking, computational methods, data product development, and publication are necessary to facilitate these paradigm shifts. Further, modularization of model code and the development of international teams of experts collaborating on advanced process representation of distinct model elements would have considerable benefits in terms of reduced duplication of effort, while architectures built on modular complexity approaches may allow differing institutional interests to be represented via alternate structural configurations and parametric choices within a given model. Significant effort is required to meet these urgent needs. The status quo investment in land surface model development is inadequate for the task at hand, and it will not suffice if we seek an ability to make robust projections of the status of the terrestrial land surface and the living systems which inhabit it over decadal to century timescales.
Modern LSMs represent a unique and powerful intersection of the fields of physics, biochemistry, physiology, ecology, hydrology, geography, statistics, mathematics, and high-performance computing. To solve our grand challenges, we must raise the profile and importance of LSMs within all these contributing fields. Given the overwhelming importance of understanding how our modification of Earth's atmosphere and climate will affect our direct living conditions and the ecological and hydrological systems on which we depend, it is imperative that LSMs step out of the shadow of their "atmospheric boundary condition" beginnings and develop into a science in their own right.