Ocean Biogeochemistry in GFDL's Earth System Model 4.1 and Its Response to Increasing Atmospheric CO2

This contribution describes the ocean biogeochemical component of the Geophysical Fluid Dynamics Laboratory's Earth System Model 4.1 (GFDL‐ESM4.1), assesses GFDL‐ESM4.1's capacity to capture observed ocean biogeochemical patterns, and documents its response to increasing atmospheric CO2. Notable differences relative to the previous generation of GFDL ESM's include enhanced resolution of plankton food web dynamics, refined particle remineralization, and a larger number of exchanges of nutrients across Earth system components. During model spin‐up, the carbon drift rapidly fell below the 10 Pg C per century equilibration criterion established by the Coupled Climate‐Carbon Cycle Model Intercomparison Project (C4MIP). Simulations robustly captured large‐scale observed nutrient distributions, plankton dynamics, and characteristics of the biological pump. The model overexpressed phosphate limitation and open ocean hypoxia in some areas but still yielded realistic surface and deep carbon system properties, including cumulative carbon uptake since preindustrial times and over the last decades that is consistent with observation‐based estimates. The model's response to the direct and radiative effects of a 200% atmospheric CO2 increase from preindustrial conditions (i.e., years 101–120 of a 1% CO2 yr−1 simulation) included (a) a weakened, shoaling organic carbon pump leading to a 38% reduction in the sinking flux at 2,000 m; (b) a two‐thirds reduction in the calcium carbonate pump that nonetheless generated only weak calcite compensation on century time‐scales; and, in contrast to previous GFDL ESMs, (c) a moderate reduction in global net primary production that was amplified at higher trophic levels. We conclude with a discussion of model limitations and priority developments.


Introduction
This contribution describes and evaluates the ocean biogeochemical component of simulations with the Geophysical Fluid Dynamics Laboratory's Earth System Model 4.1 (GFDL-ESM4.1) that were contributed to the 6th Coupled Model Intercomparison Project (CMIP6; Eyring et al., 2016). While CMIP6 supports a broad array of scientific activities, its emphasis on understanding past, present, and future climate has led to numerous contributions to the assessment reports of the Intergovernmental Panel on Climate Change (IPCC) that have shaped global climate policies. The scientific focus of CMIP6, and its role within the IPCC, puts two facets of marine biogeochemical dynamics into sharp focus. The first is the role that ocean biogeochemistry plays in modulating the oceanic uptake of anthropogenic CO 2 . The ocean has absorbed nearly 30% of CO 2 emitted from fossil fuel burning, cement production, and land use changes (Gruber et al., 2019;Sabine et al., 2004). This uptake is shaped by the "biological pump," whereby the sinking of biologically generated carbon from the ocean's surface and its subsequent remineralization at depth promotes the uptake of atmospheric CO 2 (e.g., Sarmiento & Gruber, 2013). The uncertain evolution of the biological pump under climate change contributes to the spread of future atmospheric CO 2 trajectories (Friedlingstein et al., 2014;Kwon et al., 2009). A realistic depiction of the biological pump, the uptake of atmospheric CO 2 , and the sensitivity of these processes to climate change are thus high priorities for the ocean biogeochemical dynamics within GFDL-ESM4.1.
A second aspect of marine biogeochemical dynamics brought into sharp focus by climate change is the potential for shifting ocean conditions to affect living marine resources (Doney et al., 2011;Pörtner et al., 2014). Warming oceans have been linked to shifting fish distributions (Pinsky et al., 2013) that are projected to continue (Cheung et al., 2009). Warming and projected oxygen declines may place strong constraints on fish habitats and fisheries (Deutsch et al., 2015). Acidification associated with oceanic CO 2 uptake negatively impacts coral reefs and shell-forming organisms, including many of great commercial value (Doney et al., 2009;Kroeker et al., 2013). Finally, increasing ocean stratification is projected to alter ocean productivity Laufkötter et al., 2015), creating potentially large regional shifts in fisheries production with implications for food security (Asch et al., 2019;Lotze et al., 2019;Stock et al., 2017). Anticipating such changes is crucial for formulating effective marine resource management strategies in a changing climate (Stock et al., 2011;Tommasi et al., 2017). A realistic depiction of the dynamics controlling these potential ecosystem stressors and their sensitivity to climate change is thus a second high priority for the ocean biogeochemical dynamics within GFDL-ESM4.1.
GFDL's ocean biogeochemical model development efforts reflect the carbon system and ecosystem priorities described above. The Tracers of Ocean Phytoplankton with Allometric Zooplankton (TOPAZ) biogeochemical model used in GFDL's previous ESMs (Dunne, John, et al., 2012;Dunne et al., 2013) was replaced with Version 2 of the Carbon, Ocean Biogeochemistry and Lower Trophics (COBALTv2) biogeochemical model for ESM4.1. The primary structural difference between these two models lies in COBALT's improved resolution of the plankton food web processes central to the biological pump and the flow of carbon to phytoplankton to fish. Other significant changes include refinement of particle remineralization dynamics and additional nutrient exchanges between the land, atmosphere, and ocean. Despite these advances, COBALTv2 remains a simplified depiction of ocean biogeochemical processes, many of which are subject to significant uncertainties Frölicher et al., 2016). Cheung et al. (2016), adapting the rationale for physical climate projections described by Randall et al. (2007), assert that confidence in biogeochemical-ecosystem projections rests upon (a) the reliance of biogeochemical-ecosystem models on principles expected to hold under the novel environments expected under climate change and (b) a model's capacity to simulate observed biogeochemical-ecosystem patterns. The objectives of this paper are thus to describe the formulation of the ocean biogeochemical dynamics within GFDL-ESM4.1, assess its capacity to capture observed patterns in processes critical to the priorities defined above, and document the basic biogeochemical responses to climate change. We address the first of these objectives in section 2, and the second and third in section 3. In section 4, we provide an overall model assessment and critically discuss the model strengths and weaknesses, with the latter providing priority development targets for the next generation of GFDL's ocean biogeochemical model.
change. The land model simulated land hydrology, carbon dynamics in plants and soils, and terrestrial effects on the earth's radiation balance. The land model settings used for CMIP6 did not include terrestrial nitrogen or phosphorus dynamics.
The ocean component, GFDL-OM4p5, used the recently developed Version 6 of the Modular Ocean Model (MOM6) and incorporated numerous advances relative to GFDL's last generation of ESMs . First, the horizontal ocean resolution was doubled to (½)°(~50 km). This resolution does not resolve mesoscale eddies, and GFDL-OM4p5 thus relied on parameterized eddy-induced mixing. The enhanced resolution did, however, improve representations of boundary currents and topographically controlled flows, contributing to improved fidelity with physical ocean observations Dunne et al., 2020). Second, a hybrid Z*-isopycnal vertical coordinate system based on an Arbitrary Lagrangian Eulerian (ALE) approach replaced Z and Z* coordinates used in previous MOM releases. This advance allowed a reduction in spurious diapycnal mixing in the ocean interior and associated spurious heat uptake and temperature drift (e.g., Griffies et al., 2000). Third, OM4p5 increased the number of vertical layers from 50 to 75, allowing much finer resolution near the ocean surface (~10 m for ESM2M,~2 m for ESM4.1). Finally, OM4p5 incorporated various refined parameterizations of subgrid scale and other phenomena, including a new energetically constrained parameterization of surface boundary layer dynamics (Reichl & Hallberg, 2018), mesoscale thickness mixing (Jansen et al., 2015), and submesoscale mixed layer restratification (Fox-Kemper et al., 2011).
The ocean biogeochemical component of GFDL-ESM4.1, COBALTv2, is an updated version of the Carbon, Ocean Biogeochemistry and Lower Trophics (COBALTv1) model described in Stock et al. (2014b). Like COBALTv1, COBALTv2 uses 33 tracers to represent the cycles of carbon, alkalinity, oxygen, nitrogen, phosphorus, iron, silica, calcium carbonate, and lithogenic minerals in the global ocean ( Figure 1). Elements are tracked using a combination of variable and fixed ratios for each plankton functional type and nonliving components (e.g., particulate and dissolved organic material). Section 2.2 and Appendix A provide a detailed model description. The stoichiometry of the core production and remineralization reactions are provided by Equations A1-A5.
As mentioned in the introduction, a primary structural difference between COBALTv2 and the ocean biogeochemical model used in the previous generation of GFDL's ESM (TOPAZ; Dunne et al., 2013) is an enhanced representation of plankton food web dynamics. This allows COBALTv2 to resolve nonlinear variations in the flow of energy from phytoplankton to fish Stock et al., 2014a) and additional capacity to resolve linkages between biogeochemical cycles and plankton food webs. Another key contrast is that the remineralization of sinking organic matter includes dependencies on mineral content (Armstrong et al., 2001;Klaas & Archer, 2002), oxygen and temperature (Laufkötter et al., 2017), rather than just mineral content and oxygen. Finally, the elemental cycles with COBALTv2 have additional dynamical linkages with the land and atmosphere (Table 1 and section 2.1.1). Notable additions include estimates of oxidized and reduced atmospheric N deposition, accounting for the bidirectional exchange of ammonia across the air-sea interface (Paulot et al., 2015, dust and iron supplies linked to terrestrial and atmospheric conditions (Evans et al., 2016;Shevliakova et al., 2020), and iron sources from icebergs (Laufkötter et al., 2018).
The most noteworthy simplification relative to previous GFDL ESMs is COBALTv2's reliance on fixed nitrogen to phosphorus ratios for each plankton function types. Efforts to incorporate recent advances in C:N:P stoichiometry (e.g., Martiny et al., 2013;Moreno & Martiny, 2018) into a mechanistic and computationally efficient global formulation within COBALTv2 are ongoing but had not advanced enough to allow incorporation into GFDL-ESM4.1 for CMIP6. The implications of this simplification are revisited in sections 3 and 4.
As described in Dunne et al. (2020), data from the World Ocean Atlas 2013 were used to initialize temperature, salinity, macronutrient (NO 3 , PO 4 , and SiO 4 ) and oxygen concentrations (Garcia et al., 2014a(Garcia et al., , 2014bLocarnini et al., 2013;Zweng et al., 2013). Carbon and alkalinity data from version 2 of the Global Ocean Data Analysis Project (GLODAPv2) were used to initialize dissolved inorganic carbon (DIC) and alkalinity (Lauvset et al., 2016). Other initial ocean biogeochemical fields were drawn from an earlier COBALTv1 simulation (Stock et al., 2014a). GFDL-ESM4.1 was spun up from this initial state for 400 yr. To prevent phosphorus drift, riverine phosphate concentrations were calibrated at year 210 and again at year 360 such that total phosphate delivery by rivers matched phosphate burial. Note that this calibration was not done for nitrogen, since the inclusion of nitrogen fixation and denitrification provides a means for the model to approach a natural equilibrium over the time-scales of the spin up (see section 3.1). Sediment calcite concentrations were set such that alkalinity lost via calcite burial matched river alkalinity inputs at year 360 using an offline calculation described in Dunne, Hales, and Toggweiler (2012). Riverine DIC concentrations were also calibrated to match calcite and organic carbon burial at year 360 of the spin up. Figure 1. A simplified schematic of COBALTv2 dynamics (see section 2.2 and Appendix A for details). The overlapping profiles on the left edge reflect characteristic distributions of dissolved inorganic nutrients/carbon (increasing with depth, NO 3 , NH x , Fe, PO 4 , SiO 4 , and DIC) and dissolved organic nutrients/carbon (decreasing with depth, SRDOM, SLDOM and LDOM = semirefractory, semilabile, and labile dissolved organic matter (DOM)). Small phytoplankton (SP), large phytoplankton (LP) and diazotrophs (Diazo) take up inorganic nutrients. The colored wedges correspond to carbon and nutrient pools associated with each group, with nitrogen providing the central currency. Some elements have fixed proportions with nitrogen, while others vary dynamically (section 2.2). Small, medium, and large zooplankton (SZ, MZ, and LZ) consume phytoplankton and other plankton groups. Implicit fish consume large zooplankton and are assumed to vary in proportion with zooplankton prey. Free-living bacteria (B) consume dissolved organic matter and are consumed by SZ. An implicit virus term also shunts a fraction of B and SP production back to DOM. Phytoplankton exudation and egestion of material by smaller zooplankton further supply DOM (blue arrows). Phytoplankton aggregation and fecal pellets, predominantly from larger zooplankton, generate sinking detritus. Sinking detritus is remineralized as a function of temperature, oxygen, and mineral ballast. Material reaching the sediments is remineralized, stored, or buried. Inorganic nutrients are resupplied via respiration and associated nutrient excretion. Oxygen and alkalinity dynamics are not shown in Figure 1 but are described in detail within section 2.2 and Appendix A.
The preindustrial control was spawned at the beginning of year 401. We consider a historical simulation  and an idealized CO 2 perturbation experiments (1% atmospheric CO 2 increase per year) spawned at year 101 of the preindustrial control.
Model skill assessment relies on years 145-164 of the concentration-forced historical simulation, corresponding to 1995-2014 CO 2 and aerosol levels. We chose the concentration forced runs for analyses herein to best characterize the performance of the ocean biogeochemical model under unbiased forcing (i.e., to prevent atmospheric CO 2 discrepancies potentially arising from other ESM components from influence the assessment of the ocean biogeochemical model). The behavior of the emissions-driven simulations are described in Dunne et al. (2020) and Arora et al. (2019).
Analysis of the response of ocean biogeochemical dynamics to accumulating atmospheric CO 2 relies on comparisons between the piControl, the historical, and a 1% CO 2 increase per year simulation. Cumulative ocean CO 2 uptake since 1850 was calculated based on the difference between the historical DIC inventory at the reference year for each observation-based study we compare against (1994for Sabine et al., 2004, 2002for Lauvset et al., 2016for Gruber et al., 2019and2008 for Khatiwala et al., 2009) and the corresponding year of the preindustrial control experiment. Following Bronselaer et al. (2017) and Le Quéré et al. (2018), we consider an adjustment for pre-1850 ocean carbon uptake and 1850 air-sea CO 2 disequilibrium. An additional~10-20 Pg C was considered for comparisons with estimates of cumulative uptake since 1791 (Lauvset et al., 2016;Sabine et al., 2004), and~10-30 Pg C comparisons with cumulative uptake since 1765 (Khatiwala et al., 2009). We compared years 101-120 of the 1% CO 2 per year experiment (2.7 to 3.3 times preindustrial CO 2 ) to the corresponding years of piControl to assess the Note. The designation "src/snk" indicates that the term can be either a source or sink. Elemental sources and sinks can come from other components of GFDL-ESM4.1 (e.g., dust deposition), external data sets (e.g., river nutrient loads from an external model), or operate within the ocean (e.g., nitrogen fixation). a Global NEWS refers to the Global Nutrient Export from WaterSheds model (Seitzinger et al., 2005). b Sulfate reduction is also tracked as an oxygen deficit (see supporting information).
response to high atmospheric CO 2 levels. This corresponds to atmospheric CO 2 levels between~760 and 920 ppm, similar to those projected in the latter third of the 21st century under moderate to high emissions scenarios (Meinshausen et al., 2019;van Vuuren et al., 2011).

Exchanges With the Atmosphere and Land
Two-way exchanges across the air-sea interface are resolved for CO 2 , O 2 , and ammonia. All of these exchanges have been updated and/or enhanced since COBALTv1 (Table 2). For CO 2 and O 2 , exchanges are now based on the relationships of Wanninkhof (2014) rather than those of Wanninkhof (1992). Associated carbon chemistry calculations are done with the "model the ocean carbonate system" (mocsy) routines, v2.0 (Orr & Epitalon, 2015), replacing the OCMIP protocols of Najjar and Orr (1998).
Nitrogen deposition comprises both reduced and oxidized forms. Wet and dry deposition of each component are calculated in AM4.1 and passed to COBALTv2. For ammonia, a two-way exchange is implemented following Paulot et al. (2015Paulot et al. ( , 2020. Lithogenic dust and associated iron is generated in LM4.1 as a function of wind speed, topography, soil water, ice and snow cover, vegetation cover, and land type (Evans et al., 2016;Shevliakova et al., 2020). Dust is assumed to have a 3.5% iron content, and iron solubility varies inversely with the dust concentration in accordance with Baker and Croot (2010). Soluble iron is a key limiter of phytoplankton growth (section 2.1.2), while lithogenic minerals combine with biogenic minerals resolved in COBALT to impact the remineralization depth of sinking particles (section 2.1.4; Armstrong et al., 2001;Dunne et al., 2007;Klaas & Archer, 2002).
Riverine nutrient fluxes were derived by combining estimated concentrations of dissolved inorganic and organic nitrogen (N) and phosphorus (P) values from the Global Nutrient Export from Watersheds (GlobalNEWS) project (Seitzinger et al., 2005) with freshwater flows from LM4.1. Concentrations reflected contemporary patterns, but were held constant for all ESM4.1 simulations. Particulate nutrient loads were assumed to be buried in estuarine and nearshore waters, and thus not directly considered in the simulations described herein. As described in section 2.1, P concentrations were then scaled to achieve an approximate global balance with P burial during the piControl simulation. This required a 65% increase in riverine PO 4  (Najjar & Orr, 1998) mocsy (Orr & Epitalon, 2015) Iron sources Climatological deposition from Moxim et al. (2011) with solubility from Fan et al. (2006); sediment source following Elrod et al. (2004) plus a "coastal" source of similar magnitude to sediments Dust sources/deposition from LM4.1; AM4.1 (Evans et al., 2016;Shevliakova et al., 2020) with solubility from Baker and Croot (2010); Icebergs and glacial iron sources from OM4p5 (Laufkötter et al., 2018); Sediment source from Dale et al. (2015); Geothermal source (Tagliabue et al., 2010(Tagliabue et al., , 2014 (Doney et al., 1996) Quadratic, but suppressed in good growth conditions (Waite et al., 1992, see Appendix A.2) concentrations (an additional 1 Tg P). Since phosphate loads are dominated by particulates (~9 Tg P), this upward calibration can be interpreted as a small fraction of particulate phosphate desorbing in estuaries (e.g., Froelich, 1988). Even with this additional phosphorus, river inputs remain enriched in N relative to P, with a mean molar N:P ratio of 29:1, well above the Redfield ratio of 16:1.
Iron concentrations in rivers were set to a constant characteristic value of 40 μmoles m −3 to give a total input of~1.5 × 10 9 moles yr −1 (De Baar & De Jong, 2001). Following Laufkötter et al. (2018), melt associated with glaciers and associated icebergs is given a characteristic soluble iron loading of 100 nanomoles per kg of ice melt, consistent with the magnitude suggested by Raiswell et al. (2008Raiswell et al. ( , 2016. A riverine lithogenic mineral source of 0.5 Pg yr −1 is also considered, based on an analysis of seafloor particle export (Dunne et al., 2007).
Riverine concentrations of carbon and alkalinity were calibrated to match carbon burial and net alkalinity losses to the sediment following Dunne, Hales, and Toggweiler (2012). This resulted in characteristic concentrations of 0.32 moles DIC m −3 and 0.42 mole equivalents of alkalinity m −3 , and a combined input of dissolved inorganic and organic carbon of 0.22 Pg C yr −1 . This is lower than observation-based estimates (~0.8 Pg C yr −1 ; Bauer et al., 2013;Resplandy et al., 2018) but is consistent with the assumption that a substantial fraction of particulate carbon is buried in nearshore regions poorly resolved by ESM4.1. Accounting for a net nearshore burial of 0.45 Pg C yr −1 and an uncertainty of 50% (Bauer et al., 2013) puts ESM4.1 on the low end of existing estimates of carbon inputs from riverine systems.
We note that ESM4.1 did not consider atmospheric or riverine silicon inputs or silicon burial. While the timescale of oceanic silica cycles is substantially longer than the timescale of our simulations (~10,000 yr, Tréguer & De La Rocha, 2013), we recognize that there have been considerable recent trends in coastal regions due to river damming and other anthropogenic influences with implications for carbon cycles and plankton communities (Laruelle et al., 2009;Turner & Rabalais, 1994). Plans to address this limitation, and other omitted exchanges between the land, atmosphere and ocean will be Discussed in section 4.

Nutrient and Phytoplankton Dynamics
As described in Stock et al. (2014b), phytoplankton dynamics in COBALTv2 are represented within a sizeand functional group-structured plankton food web (Figure 1, Appendix A.1). Phytoplankton are divided into small (SP), large (LP), and a trichodesmium-like diazotroph group (Diazo). The delineation between SP and LP is~10 μm in equivalent spherical diameter (ESD). Diatoms are a diagnosed fraction of LP based on silica limitation (Equations A13 and A14). COBALTv2 does not explicitly resolve calcite-generating coccolithophorids, opting instead to model net production of calcium carbonate detritus as a function of saturation state and associated detritus-generating processes (see section 2.1.4).
The formulation of Geider et al. (1997), which includes variations in the chlorophyll to carbon ratio in response to nutrient and light limitation, is used to model photosynthesis (Equations A6-A8 and  Table A2). Maximum photosynthetic rates are drawn from the upper bound of values reported by Geider et al. (1997) to be consistent with recent maximum growth rate compilations (Bissinger et al., 2008;Brush et al., 2002) after accounting for respiration. The increase in growth rates with temperature follows Eppley (1972). SP are assumed to have higher chlorophyll-specific light harvesting rates than LP due to their higher ratio of surface area to volume (i.e., a reduced "package effect" following Morel & Bricaud, 1981). Light attenuation with depth was modeled with the scheme of Manizza et al. (2005), which divides visible light into red and blue/green bands with different absorption coefficients and includes chlorophyll feedbacks.
Phytoplankton in COBALTv2 draw nutrients from inorganic nutrient pools only, which include nitrate (NO 3 ), ammonium (NH 4 ), phosphate (PO 4 ), dissolved inorganic iron (Fe), and silicate (SiO 4 ). In addition to the inhibition of NO 3 uptake in the presence NH 4 , COBALTv2 considers the potential for NH 4 uptake to be inhibited in the presence of high NO 3 using the formulation of O'Neill et al., (1989;Equations A9 and A10). Michaelis-Mentin kinetics were used to simulate uptake of other nutrients, with half-saturation constants 5 times smaller for SP than LP, reflecting the advantage of high surface area to volume ratios for nutrient uptake (Edwards et al., 2012;Gavis, 1976;Munk & Riley, 1952; Table A2). Diazotrophs take up iron and phosphate similarly to other phytoplankton, but obtain their nitrogen through a combination of N 2 -fixation and dissolved inorganic nitrogen uptake, if the latter is available (Holl & Montoya, 2005). Silica is only taken up by large phytoplankton, with both the fraction of large phytoplankton uptake associated with diatoms and the Si:N ratio of that uptake posed as a saturating function of the silica concentration (Equations A13 and A14).
Phytoplankton must compete for available iron with scavenging by detrital particles. COBALTv2 iron scavenging uses a single ligand model (Johnson et al., 1997). Light reduces the effectiveness of ligand binding (Fan, 2008) and scavenging increases sharply when unbound iron concentrations exceed solubility limits (Liu & Millero, 2002). Updates to the iron dynamics since COBALTv1 ( Table 2) include replacement of first-order scavenging dynamics with a dependence on the product of free iron and detritus (Equation A20), updated iron sources, accounting for free iron solubility, and the addition of ligands proportional to nonrefractory dissolved organic matter. The iron scavenging rate coefficient was calibrated to produce biome-scale variations in iron limitation that are consistent with observations , section 3.2.2).
The ratio of carbon to nitrogen (C:N) associated with nitrogen uptake was held at the Redfield ratio (106:16) and other nutrients within phytoplankton were tracked relative to N ( Figure 1). Phytoplankton were assigned characteristic N:P ratios by size and functional type (Finkel et al., 2009;Quigg et al., 2003): diazotrophs and SP were assumed to be nitrogen-rich relative to the 16:1 Redfield N:P ratio (40:1 and 20:1, respectively), while LP were assumed to be phosphate rich (12:1). These values differ from COBALTv1 (Table 2). Phytoplankton N:Fe ratios were allowed to vary dynamically according to Sunda and Huntsman (1995;Equation A12). Nutrient limitation was modeled with Liebig's Law of the Minimum (Liebig, 1840). Phytoplankton production creates oxygen with a C:O 2 ratio of 106:150 for NO 3 -based production (Anderson, 1995) and 106:118 for NH 4 -based production (Equations A1 and A2).
Photosynthetically fixed organic material is either exuded to labile dissolved organic material (Baines & Pace, 1991), consumed by zooplankton (section 2.1.3), or sinks as phytoplankton aggregates. Phytoplankton aggregation and sinking is modeled as a quadratic phytoplankton loss term (Doney et al., 1996;Jackson, 1990) to coarsely mimic the particle-particle interactions required for aggregation (i.e., LP loss~LP 2 ) and the potential for rapid aggregation during bloom conditions. Unlike COBALTv1, phytoplankton aggregation and subsequent sinking is suppressed when nutrients and light are replete (e.g., Waite et al., 1992; see Equations A17 and A18). Aggregated phytoplankton are removed from the phytoplankton pool and added to sinking detritus.

Consumer Dynamics
Consumer dynamics in COBALTv2 (Appendix A.2) were largely maintained from COBALTv1. Three zooplankton size classes are simulated: small zooplankton (SZ) are <200 μm ESD (i.e., microzooplankton), medium zooplankton (MZ) are~200-2,000 μm ESD, and large zooplankton (LZ) and >2,000 μm ESD (e.g., large copepods and krill). Material consumed by zooplankton is partitioned between egestion, respiration/excretion and, if any energy remains, zooplankton production (Equations A15 and A16). MZ and LZ are consumed by an implicit higher trophic level predator (e.g., fish) whose biomass is assumed to vary in proportion with available zooplankton prey (Steele & Henderson, 1992). Similar to coccolithophorids, the production of calcium carbonate shells associated with foraminifera (which fall within SZ) and the production of aragonite detritus associated with pteropods (which fall within MZ and LZ) were parameterized as a function of calcite and aragonite saturation state, respectively, and associated detritus-generating processes (see section 2.1.4).
Biomass-specific maximum grazing rates decrease with zooplankton size in accordance with Hansen et al. (1997). Grazing on a single prey type is modeled with a saturating (i.e., Type II Holling) function with mild density-dependent switching between alternative prey types (Gentleman et al., 2003;Stock et al., 2008;Equation A15). Half saturation constants for zooplankton feeding were held constant across zooplankton types (Hansen et al., 1997) and calibrated to generate realistic global mean patterns in phytoplankton biomass and turnover rates (Stock & Dunne, 2010). Active respiration rates are proportional to ingestion and set to obtain maximum growth efficiencies of 0.4 at high food concentrations (Straile, 1997). Basal respiration is proportional to biomass and calibrated to produce mesozooplankton biomass consistent with observed values in oligotrophic ocean gyres (Stock & Dunne, 2010). All zooplankton rates are assumed to scale with temperature in a manner analogous to phytoplankton. Medium and large zooplankton are given Redfield C:N:P stoichiometry. Small zooplankton are given a value of 18:1 to reflect the high N:P of small phytoplankton prey. Zooplankton iron pools are not tracked.
Zooplankton are assumed to egest 30% of what they consume (Straile, 1997). Egestion is partitioned between dissolved organic material and particulate detritus, with larger organisms producing higher fractions of particulate detritus (Det in Figure 1, e.g., Table A3). The model does not differentiate egestion from "sloppy feeding," as both result in fluxes to dissolved and particulate organic carbon. COBALTv2 resolves three types of dissolved organic material ( Figure 1 and Table A4): a semirefractory component (SRDOM) that decays over multiannual to decadal time-scales; a semilabile component (SLDOM) that decays on seasonal timescales, and a rapidly turned over labile component (LDOM) that is consumed and remineralized by free-living bacteria (B). The partitioning of fluxes to semirefractory and semilabile components was coarsely calibrated for consistency with cross-biome and seasonal variations in dissolved organic material (Abell et al., 2000;Hansell, 2001), respectively. Free-living bacteria are consumed by small zooplankton and subject to an implicit density-dependent virus-driven mortality. Nitrifying bacteria are modeled implicitly, account for ammonia/ammonium partitioning Equation A19).
Respiration by zooplankton and free-living bacteria is associated with oxygen consumption and remineralization of N, P via excretion in proportions required to maintain specified stoichiometry. Activities of both these groups are thus ramped down in low O 2 environments in accordance with oxygen-dependent remineralization described by Laufkötter et al. (2017).

Export From the Euphotic Zone and Sinking
Organic material and minerals are exported from the surface layer via two mechanisms (1) the sinking of particulate detritus (Det, Figure 1) and (2) the downward mixing of dissolved and nonsinking particulate organic material (Appendix A.3). Particulate detritus is generated by zooplankton fecal pellets and the phytoplankton aggregates and is assumed to sink at 100 m day −1 . As in COBALTv1, the presence of biogenic minerals (silica, calcite, and aragonite) and lithogenic dust inhibits remineralization (Armstrong et al., 2001;Dunne et al., 2005;Klaas & Archer, 2002). However, COBALTv2 has replaced the previous oxygen dependence of remineralization used in COBALTv1 with the temperature and oxygen dependence derived in Laufkötter et al. (2017): Remineralization rates increase with temperature with a Q 10 of 1.88 and decrease with oxygen with a half-saturation of 8 μmoles kg −1 until an anoxic remineralization rate (via denitrification) equal to~1/10 of the aerobic value is reached (Equation A22). Remineralization rates are also slowed above 150 m to account for colonization of sinking material (Mislan et al., 2014).
Particulate silica is generated by diatoms and transformed to detrital silica through phytoplankton aggregation and zooplankton fecal pellets ( Figure 1). As described above, generation of calcite and aragonite detritus by coccolithophorids (associated with SP), foraminifera (associated with SZ) and pteropods (associated with MZ and LZ) was estimated as a function of the calcite/aragonite saturation state and detritus-generating processes associated with MZ and LZ (Appendix A.3 and Equation A21). The scaling factors for these relationships were coarsely calibrated to produce sedimentary calcite fluxes consistent with Dunne et al. (2007) ( Table A5). Lithogenic dust is supplied by AM4.1 and rivers as described in section 2.1.1 and translated to detritus through nonselective filter feeding by MZ and LZ. Dissolution of lithogenic dust has a constant length scale, while dissolution of calcite and aragonite detritus as it sinks depends on the associated saturation state (see Appendix A.3 and Equations A23-A24). A temperature dependence was added to the silica dissolution rate in COBALTv2, in accordance with Kamatani (1982).
For iron, a reduced dissolution rate relative to the remineralization rates of organic material was found to yield iron profiles with depth that were more consistent with observations (e.g., see comparisons in Tagliabue et al., 2016). A multiplicative "iron dissolution efficiency" of 0.25, expressing the relationship between iron dissolution (day −1 ) relative to organic matter dissolution rates (day −1 ), was thus maintained from COBALTv1. The depth profile of sinking iron detritus is also shaped by the continuation of scavenging dynamics below the euphotic zone.

Sediment Dynamics
Organic detritus reaching the sediment is either remineralized or buried (Appendix A.4). Burial is calculated based on Dunne et al. (2007). Since this burial function was derived for deep waters, burial was ramped upward to its full predicted values with a half-saturation depth of 50 m. After accounting for burial, the amount of organic material remineralized via denitrification was estimated from Middelburg et al. (1996). The remainder was assumed to be remineralized aerobically if sufficient oxygen was available. If oxygen was insufficient, denitrification was augmented. In the rare case that nitrate was depleted, the remainder of the organic material was assumed to be remineralized through sulfate reduction to hydrogen sulfate, which was tracked as a negative oxygen deficit.
The organic detritus dynamics described above were applied to carbon, nitrogen, and phosphorus in organic material. For iron, the flux from the sediment was set independently from the local flux to the sediment based on the empirical relationship of Dale et al. (2015), which estimates the iron flux as a function of the organic matter flux and bottom oxygen. This is an update from COBALTv1, which estimated the sediment iron flux as a function of the organic matter flux alone (Elrod et al., 2004). A geothermal iron source of~2.2 Gmol Fe yr −1 was also included, intermediate in magnitude between the 0.8 Gmol Fe yr −1 estimated by Tagliabue et al. (2010) and the 13.5 Gmol Fe yr −1 considered by Tagliabue et al. (2014). The geothermal Iron flux was introduced at the ocean bottom in proportion to maps of geothermal heating used to force the physical ocean simulation Davies, 2013).
Calcite reaching the sediment was partitioned between redissolution, storage near the sediment surface, and burial based on the sediment diagenisis metamodel of Dunne, Hales, and Toggweiler (2012). This allows COBALTv2 to simulate century to millennial scale calcium carbonate compensation responses to acidification associated with sediments. Other biogenic minerals (silicate, aragonite) reaching the seafloor are remineralized instantaneously. Lithogenic minerals are buried.

Alkalinity Dynamics and Drift
Alkalinity exerts an important influence on the ocean CO 2 uptake and storage capacity and is impacted by diverse processes described in the previous sections. We thus conclude the model overview with a discussion of alkalinity dynamics to support points raised in section 3.
Changes in the ocean alkalinity inventory in COBALTv2 can be grouped into three categories: (1) direct alkalinity inputs from rivers, (2) alkalinity changes associated with calcium carbonate burial/dissolution to and from the sediment, and (3) alkalinity changes arising from nitrogen redox state changes primarily associated with the creation and remineralization of organic matter. The creation/dissolution of calcium carbonate leads to a decrease/increase of ocean alkalinity by 2 mole equivalents per mole of biogenic calcium carbonate created/dissolved. The creation of calcium carbonate at the surface and its subsequent dissolution at depth thus generates an alkalinity deficit near the surface. Furthermore, net calcium carbonate storage within/dissolution from sediments leads to an overall decrease/increase of ocean alkalinity. As described in section 2.1, an offline calculation was done at the end of the model spin-up to equilibrate the sediment calcite concentrations and resulting burial with riverine inputs (Dunne, Hales, & Toggweiler, 2012). Any subsequent changes in ocean bottom water properties (e.g., bottom waters becoming more acidic) or changes in calcium carbonate flux to the sediment (e.g., suppression of calcite formation due to surface acidification) can drive net addition or loss of alkalinity to/from the sediment.
Alkalinity changes associated with organic matter cycling consists of three primary steps, beginning and ending with NO 3 : 1. An alkalinity increase of 1 mole equivalent for each mole of carbon fixed with NO 3 -based (new) production. 2. A further alkalinity increase of 1 mole equivalent when organic carbon is aerobically remineralized to ammonium. 3. An alkalinity decrease by 2 mole equivalents when organic carbon is fully remineralized to NO 3 via nitrification.
Recycled production with the ammonium created in Step 2 can delay the completion of this cycle, but the eventual return of fixed nitrogen to nitrate leads to no net alkalinity change. Anaerobic remineralization, production via nitrogen fixation, and external inputs/removal of organic matter or reduced nitrogen, however, can create imbalances in the cycle above and drive a net alkalinity change: a. Influx of organic matter from rivers enter after step 1 of the cycle above, imposing a net 1 mole equivalent reduction of alkalinity per mole of organic matter upon eventual remineralization to NO 3 . b. Net influxes of reduced nitrogen (NH 4 ) from the atmosphere enter the ocean after Step 2, resulting in a net 2 mole equivalent reduction of alkalinity per mole of NH 4 upon return to NO 3 . c. Nitrogen fixation by diazotrophs accomplishes Step 1 without a 1 mole alkalinity increase, resulting in a 1 mole equivalent reduction upon return to NO 3 .

d. Accomplishing
Step 2 via denitrification generates a larger alkalinity increase per mole of carbon remineralized than aerobic remineralization, creating a net positive alkalinity shift. e. Organic carbon burial circumvents Steps 2 and 3, leading to a net positive alkalinity shift of 1 mole equivalent per mole of organic carbon buried. f. Net accumulation of nitrogen in organic carbon or reduced nitrogen (NH 4 ) can have an impact similar to burial (i.e., the positive alkalinity effects of Steps 1 and 2 above that is not balanced by a return to nitrate).
For purposes of analyzing the alkalinity drifts and response to CO 2 forcing, the range of mechanisms described in (a)-(f) will be grouped as changes associated with organic matter cycling. The stoichiometric equations underlying the alkalinity changes summarized above are provided as Equations A1-A5.

Observations and Observation-Based Estimates Used for Model Evaluation
We rely on a range of observations and observation-based estimates of biogeochemical quantities and processes to characterize simulation skill (Table 3). We consistently considered three skill metrics: the Pearson correlation to assess pattern agreement, the root mean squared error to assess the model-data spread, and the bias to ascertain systematic errors. A log 10 -transform was used in cases where misfits exhibited high skewness, as is typical for patchy ocean quantities that vary over several orders of magnitude (e.g., Campbell, 1995). We recognize that some of the observation-based products we use reflect considerable processing of raw measurements to derive the quantities of interest (e.g., the neural network-based pCO 2 and carbon flux estimates by Landschützer et al., 2016), but our focus is generally on robust large-scale cross-biome and seasonal patterns similar across products. In cases where significant discrepancies exist, such as with satellite-based chlorophyll estimates (Carr et al., 2006), we use multiple algorithms to illustrate where COBALTv2 sits relative to limiting cases. Additional details for each comparison are provided in section 3.

Results
As outlined in section 1, the results focus on two main objectives: (1) evaluating the simulation's fidelity with observed biogeochemical processes and patterns central to ocean ecosystem stressors and the biological pump (section 3.2) and (2) describing the model's first-order responses to historical forcing and accumulating atmospheric CO 2 (section 3.3). To ensure these elements are interpreted properly, however, we begin in section 3.1 by documenting the model spin-up and drift.  Dissolved organic carbon CLIVAR/GO-SHIP Cruises (see Figure   per century. Significant multidecadal fluctuations in air-sea carbon exchanges were associated with periodic opening and closing of large polynyas in the Ross Sea, which generated periods of outgassing and ingassing, respectively .

Spin-up and Biogeochemical Drift
Ocean alkalinity also exhibited a positive drift (Figure 2c). Over the first 360 years, the alkalinity drift was 0.11 Pg C equivalent yr −1 , somewhat larger than the carbon drift. Offline equilibration of the sediment calcite and adjustment of riverine alkalinity inputs reduced this to 0.063 Pg C equivalent yr −1 . More than half of this remaining drift (0.04 Pg C equivalent yr −1 , Figure 2d) was linked to organic matter cycling. Specifically, the net positive alkalinity signal arising from denitrification (Mechanism d, section 2.1.6) outweighed the negative net alkalinity arising from nitrogen fixation and subsequent remineralization to NO 3 (Mechanism c in section 2.1.6). The surplus of denitrification over nitrogen fixation was related to an overexpression of hypoxic regions (stimulating denitrification) and an overexpression of phosphate limitation (impeding nitrogen fixation) that will be described in section 3.2. Ultimately, the alkalinity drift is still similar in magnitude to the carbon drift, accounting for just 0.019% of the alkalinity inventory per century.
Oxygen exhibits a significant negative drift for much of the spin-up, losing about 10 petamoles over the first 400 yr ( Figure 2e). This drift was also primarily associated with the development of an overly large eastern equatorial Pacific hypoxic zone (see section 3.2) and slows just prior to the preindustrial control simulation. Only 3 petamoles O 2 were lost over the 400 yr piControl, or 0.39% of the ocean inventory per century. Analysis of the underlying fluxes creating this drift showed oxygen production via net autotrophy being counteracted by a slightly larger net O 2 outgassing from the ocean ( Figure 2f). The slowed drift just prior to the piControl was associated with decreased O 2 outgassing. O 2 outgassing also began to exhibit significant decadal to centennial scale variability at this time. Fluctuations were anticorrelated (r = −0.94) with low-frequency variations in the air-sea CO 2 flux ( Figure 2b): the opening of Ross Sea polynyas drives a large outgassing of CO 2 and ingassing of O 2 .
After an initial increase, the oceanic nitrogen inventory decreased over much of the spin-up and preindustrial control (Figure 3a), reflecting a surplus of denitrification over nitrogen fixation and riverine inputs. The rapid increase in denitrification early in the spin-up is associated with the same overexpression of open ocean hypoxic zones that underlies the negative O 2 trend. A slight increase in nitrogen fixation was seen at spin-up year 360, corresponding to the final upward adjustment of phosphorus inputs from rivers to match burial, hinting at a strong role of phosphorus in limiting nitrogen fixation. The resulting −23 Tg N yr −1 nitrogen drift is only a −0.4% change in the nitrogen inventory per century, a significant improvement over previous GFDL ESMs .
The oceanic phosphorus budget in GFDL-ESM4.1 ( Figure 3c) is a simple balance between inputs from rivers and export from burial. Burial exceeded river inputs over the first 200 yr of the simulation, resulting in the loss of 0.27 Pg P. Even this, however, was only a 0.13% change in the ocean P inventory per century. Subsequent calibration of the river phosphorus content to match burial reduced this drift to~0.05 Pg P increase over the next 600 yr, or just 0.02% per century for the piControl, and provided a stable phosphate baseline near observed values.
The global iron inventory increased from 36.4 Tg Fe at the start of the spin-up to 37.2 Tg Fe at the start of the piControl ( Figure 3e). The increase during the piControl was then only 0.2 Tg Fe over the next 400 yr, reflecting a stable balance between diverse iron sources and net removal by the sediments (Figure 3f). We note that the net removal by the sediment is a balance between a very large source and an even larger sink (+12.8 and −13.3 Tg Fe yr −1 , respectively, not shown in Figure 3). Thus, as was the case for COBALTv1 (analyzed in Tagliabue et al., 2016), COBALTv2 belongs to a subset of ESMs featuring relative rapid scavenging and characteristic iron turnover times <10 yr (though deep ocean timescales may be significantly longer than this characteristic rate). The overall iron drift during the preindustrial phase corresponded to only 0.13% per century.
With the exception of carbon drift thresholds considered for C4MIP, there are no broadly accepted targets for biogeochemical drift during the preindustrial control simulation. A drift of any magnitude could arguably be accounted for by subtracting it from projected changes to obtain the climate change signal (we do this in our analysis GFDL-ESM4.1's response to increasing CO 2 in section 3.3). Problems can arise, however, if drifts are so large that the displaced mean biogeochemical state strongly impacts the change signal. For example, a strong positive nutrient drift can lead to biases that make ocean productivity less sensitive to decreasing nutrient fluxes to the surface ocean expected within an increasingly stratified ocean under climate change (e.g., Vancoppenolle et al., 2013). Drifts in all the major carbon quantities, oxygen and limiting nutrients in GFDL-ESM4.1 fell well below 0.5% of the ocean inventory per century, with DIC and alkalinity well below, reducing the risk of such skewed responses.

Model-Data Comparison
The model-data comparison progresses from the ocean surface to depth, covering major components of the ecosystem stressors and biological pump. Comparisons for most quantities were made using 1995-2014 averages. The primary exception was comparisons with carbon system measurements from GLODAPv2, which used 1992-2012 averages to center comparisons around GLODAPv2's 2002 reference year.

Air-Sea Carbon, Oxygen, and Nutrient Exchanges
ESM4.1 simulated air-sea exchange of carbon dioxide and related surface carbon quantities were consistent with observations and widely applied observation-based estimates (Figure 4). The simulated net air-sea flux during 1994-2007 was 2.16 Pg C yr −1 , somewhat lower than the 2.6 ± 0.3 Pg C yr −1 estimated by Gruber et al. (2019) over this period. As observed, simulated CO 2 outgassing was prevalent in warm, high pCO 2 equatorial waters and ingassing prevailed in much of the extratropics. The primary observed exceptions to this extratropical ingassing tendency, the high latitude North Pacific and Southern Ocean, were also well captured by the model.
The simulated seasonal pCO 2 cycle, as reflected by the summer-winter pCO 2 difference, was consistent with neural-network based estimates from Landschützer et al. (2016) (Figure 5). This consistency extended to estimated thermal and biologically driven pCO 2 variations ( Figure 5, Rows 2 and 3): The subtropical and midlatitude pCO 2 cycle was dominated by a temperature-driven patterns, with summer warming leading to pCO 2 outgassing that was slightly larger in ESM4.1 than in Landschützer ( Figure 5, middle panels). Meanwhile, high latitudes were dominated by the biologically driven drawdown of pCO 2 during productive spring and summer months, leading to summer ingassing. This strong high-latitude biological imprint on summer pCO 2 is consistent with observations. In the North Atlantic, it has furthermore been identified as a significant constraint on future carbon uptake (Goris et al., 2018), building confidence in ESM4.1's capacity to estimate ocean CO 2 uptake under climate change.
Simulated dust deposition captures observed variations over 4 orders of magnitude ( Figure 6). These include low fluxes (<0.5 g m −2 yr −1 ) over much of the Southern Ocean, equatorial Pacific, and northeast Pacific, areas identified as iron limited . . Oxidized nitrogen accounts for ⅔ of the net atmospheric supply of nitrogen to the ocean. Upwelling regions off the coast of South America and Africa, the equatorial Pacific, and the Amazon estuary, are net sources of reduced nitrogen to the atmosphere. The source of ammonia to the atmosphere from ocean outgassing is 3.1 Tg N yr −1 in good agreement with estimates derived from COBALTv1 (Paulot et al., 2015).

Surface Ecosystem Dynamics
Simulated annual mean surface macronutrients (nitrate, phosphate, and silicate) in GFDL-ESM4.1 were broadly consistent with observed patterns (Figure 8, top three panels), with correlations exceeding 0.9 in all cases. For nitrate, bias, and root-mean-square error (RMSE) were also small. The most notable discrepancy was overestimation of nitrate and phosphate off the Peruvian and Chilean coasts. This area was iron-deficient ( Figure 8, bottom panel), suggesting that a possible undersupply of iron relative to macronutrients in subsurface waters. This possibility will be explored further in the assessment of subsurface patterns in section 3.2.4. In the North Atlantic, elevated nitrate extended further south in the model than observed. A similar high nitrate bias was evident in the Bay of Bengal.
Surface phosphate concentrations in GFDL-ESM4.1 were highly correlated with observations (r = 0.92), but the model has a moderate low bias (−0.15 mmoles m −3 ) relative to WOA18. The low bias was most prominent in oligotrophic regions of the South Atlantic, Pacific, and Indian Oceans. Simulated values in these regions routinely reach 0.025-0.05 mmoles m −3 , while WOA18 values routinely exceed 0.25 mmoles m −3 .
We note, however, that recent compilations of high sensitivity phosphate measurements suggest that phosphate concentrations < 0.05 mmoles PO 4 m −3 are far more extensive than WOA18, which includes earlier less precise measurements, suggests . The model bias perceived in Figure 8 is thus likely in part the result of data limitations, though we note that phosphate concentrations are low relative to even the Martiny et al. (2019) data in some regions.

Journal of Advances in Modeling Earth Systems
Simulated surface silicate distributions exhibited both a high correlation and low bias. GFDL-ESM4.1 furthermore captured the observed silica contrasts across the high-latitude ocean areas, with the Southern Ocean exhibiting the highest values, followed by the North Pacific and North Atlantic.  Biome-scale variations in annual mean chlorophyll and NPP were captured moderately well by GFDL-ESM4.1 (Figure 9). Simulated annual mean chlorophyll (top row) ranges from <0.1 mg Chl m −3 in subtropical gyres to >1 mg Chl m −3 in highly productive coastal regions. This reflects realistic variations in phytoplankton biomass and chlorophyll to carbon ratios (Appendix B and Figure B1). The simulated range of surface chlorophyll concentrations, however, still falls short of the observed range, with the standard deviation of the simulated chlorophyll concentrations in Figure 9 only 82% of the standard deviation of satellite measurements after averaging onto the model grid. Subsurface chlorophyll maxima in subtropical gyres were also shallower than observed, falling just above a shortwave irradiance of 1 W m −2 ( Figures B2 and B3, Moeller et al., 2019).
Annually integrated NPP (Figure 9, middle row) was 51.1 Pg C yr −1 , near the mean of global estimates (Buitenhuis, Hashioka, et al., 2013;Carr et al., 2006). The latitudinal mean NPP in ESM4.1 was at the lower end of the range of the satellite-based NPP measurements considered herein north of 20°S, and on the high end of the range south of 20°S. Closer inspection of Figure 9 revealed contrasting NPP biases in different parts of the tropics. In the equatorial Pacific, GFDL-ESM4.1 generated high Chl and NPP values relative to satellite-based estimates, while values in the equatorial Atlantic and Indian oceans are lower than those estimated from satellite. These patterns are corroborated by comparison against 13 C and 14 C-based NPP measurements, where the simulated equatorial Pacific NPP was above measured values and Arabian Sea NPP was significantly lower than observed ( Figure 10). ESM4.1's failure to capture high Arabian Sea NPP was the primary cause of reduced correlation with interregional NPP differences relative to satellite-based estimates (r = 0.51 versus 0.8; Figure 10). However, ESM4.1 exhibited low overall bias relative to observed NPP and a RMSE slightly less than the satellite estimate. Breaking down NPP by size class revealed a robustly larger contribution to NPP of large phytoplankton (compared to smaller size classes) in more productive ecosystems (Appendix B and Figure B4), consistent with in situ and satellitebased measurements (e.g., Uitz et al., 2010). Seasonal chlorophyll and NPP cycles were also consistent with observations ( Figure B5).
Analysis of the nutrients limiting phytoplankton growth ( Figure 11, top panel) provided insight into the potential origin of these contrasting equatorial biases. There are numerous similarities between simulated nutrient limitation and the synthesis of nutrient amendment experiments of Moore et al. (2013): the primary Fe-limited High-Nitrogen-Low-Chlorophyll (HNLC) regions in the Southern Ocean, Equatorial Pacific and, to a lesser degree, North Atlantic were well captured. Weak Fe limitation relative to macronutrients also extended to the eastern boundary upwelling systems, consistent with growing evidence for iron limitation in these systems (  Nitrogen (N) was the limiting nutrient in the majority of non-Fe-limited regions, but phosphorus (P) limitation emerged more prominently than Moore et al. (2013) suggests. The largest region of P limitation was the North Atlantic, where strong P limitation occurred in the subtropical gyre and weakly P-limiting conditions radiated outward from this core. Surface phosphate concentrations in the subtropical North Atlantic are exceptionally low  and the area is identified as robustly P colimited by Moore et al. (2013), as have waters within the adjacent to Mediterranean Sea (Ammerman et al., 2003;Thingstad et al., 2005). However, weak P limitation extends further into the subpolar and equatorial North and South Atlantic than observations support. The equatorial extension of North Atlantic P limitation overlies areas of low simulated chlorophyll and NPP relative to observations off central east Africa (Figure 9), and a nitrate surplus (Figure 8). Overexpression of P limitation may also underlie underestimated NPP and chlorophyll in the Indian Ocean, though we note that phosphate-deplete conditions have been observed in some Indian Ocean and Northwest Pacific regions (Kim et al., 2011;Martiny et al., 2019).
Several factors, including high N:P ratios of riverine dissolved nutrient inputs (Seitzinger et al., 2005), the limited capacity of the (1/2)°ocean configuration to retain these inputs in coastal waters (Liu et al., 2019),

Journal of Advances in Modeling Earth Systems
increasing atmospheric N deposition (Figures 2 and 7), and the rigidity of the plankton stoichiometry in COBALTv2 (section 2) may have contributed to the prominence of phosphate limitation in GFDL-ESM4.1. These factors will be discussed in section 4. Fortunately, since nitrate to phosphate ratios in the ocean are generally near Redfield values, the alternation between N and P limitation is often quite subtle such that the overexpression of phosphate limitation in some regions did not lead to widespread biases in  Figure 11. We note that modeling diazotrophs as a single, larger trichodesmium-like group rather than a diverse group of small and large species (Gradoville et al., 2017), and ignoring heterotrophic nitrogen fixation in low oxygen areas (Loescher et al., 2014), may also contribute to low N 2 -fixation biases in ESM4.1.
Small zooplankton consumed 49.5% of available primary production, with the fraction consumed exceeding 60% in the subtropics and dropping to~40% near the poles (Figure 12, top right). This is lower than suggested by dilution experiments synthesized Calbet and Landry (2004) (~75% near the subtropics and~59% near the poles) but latitudinal contrasts and overall prominence are similar. Dissolved organic matter produced by  this microbial food web and secondary contributions from larger organisms yielded dissolved organic carbon distributions that are consistent with observed patterns (Figure 12, bottom row). These, in turn, supported production by free living bacteria within the euphotic zone that was consistently 17.8% of NPP across ocean biomes (Figure 12, top right), consistent with ocean observations summarized in Cole et al. (1988) and Ducklow (1999). The microzooplankton and bacterial biomass associated with these fluxes were furthermore consistent with observations at key ocean time series sites (Appendix B and Figure B6).   Top right: the ratio of free-living bacterial production to primary production. Bottom row: comparison of simulated dissolved organic carbon with measurements from the GO-SHIP/CLIVAR repeat hydrography transects (Barbero et al., 2018;Baringer et al., 2014;Bullister et al., 2010;Feely et al., 1996Feely et al., , 2007Feely et al., , 2008Feely et al., , 2009Feely et al., , 2013aFeely et al., , 2013bHansell, 2015;Sabine et al., 2012;Swift et al., 2014;Tilbrook et al., 2011;Wanninkhof et al., 2017;Wanninkhof, Feely, Dickson, Hansell, et al.,  Moving up the food web, mesozooplankton (medium and large zooplankton in Figure 1) are a key link between phytoplankton production and fisheries yields (Ryther, 1969;Stock et al., 2017). Simulated mesozooplankton biomass and production from the COBALTv2 are modestly correlated with patchy observations (Figure 13, top two panels). The magnitude of mesozooplankton biomass and production, however, is consistent, as are the contrast between oligotrophic subtropical gyres and more productive high latitude and coastal areas. The sharp contrast in mesozooplankton biomass and production between these regions is also apparent as an increase in the ratio of mesozooplankton production to primary production from the subtropics to adjacent regions ( Figure 13, bottom panel). This increase is also evident in data-based estimates (Stock & Dunne, 2010) and is consistent with sharp spatial gradients in observed fisheries catch that greatly exceed gradients in primary production (Ryther, 1969;Stock et al., 2017).
Finally, to conclude the assessment of surface biogeochemical patterns, we note that ESM4.1 skillfully simulated surface dynamics in quantities governing organismal responses to ocean acidification (pH, Ω Calc , Ω Arag , Figure 14). While the spatial correlation with pH was only moderate (r = 0.46), much of this reflects small-scale patchiness in the observed fields that is not evident in the more widely  (Table 3). In high latitudes, observations are overwhelmingly taken during the warmer months. We thus restrict the model average to nonwinter months (all but DJF in the Northern Hemisphere; all but JJA in the Southern Hemisphere) above 35°latitude for all panels. Middle panel: Simulated versus an observation-based estimate of mesozooplankton production. The observation-based estimate of mesozooplankton production was derived following Stock and Dunne, 2010: Mesozooplankton growth rates were derived as an empirical function of chlorophyll and temperature (Hirst & Bunker, 2003), assuming a 25 μg C characteristic mesozooplankton weight. Chlorophyll was derived as in Figure 9 and translated to a euphotic zone average using the relationships of Morel and Berthon (1989). Mesozooplankton growth rates obtained in this way were then multiplied by mesozooplankton biomass to estimate production in mg C m −2 day −1 . Lower panel: The ratio of mesozooplankton production to primary production (the "Z-ratio" defined by Stock & Dunne, 2010).

10.1029/2019MS002043
Journal of Advances in Modeling Earth Systems observed DIC and Alk (i.e., Figure 4) and the simulated bias and RMSE with respect pH are small. Agreement with observed saturation with respect to calcite and aragonite (Ω Calc , Ω Arag ) was strong across all metrics. Ω values <1 indicate an undersaturation with respect to calcite or aragonite (i.e., corrosive waters), which would, among other things, hinder shell and coral formation. ESM4.1 correctly simulated vulnerable regions at high latitudes and in areas of persistent upwelling, particularly eastern boundary upwelling systems.

Export From the Surface Ocean
The average flux of sinking organic carbon particles at 100 m in GFDL-ESM4.1 over 1985-2014 was 6.3 Pg C yr −1 (Figure 15), at the lower end of the 9.6 ± 3.6 Pg C yr −1 suggested by Dunne et al. (2007), greater than estimates of~5 Pg C yr −1 from Henson et al. (2011), and aligned with recent estimates from Siegel et al. (2014, 6 Pg C yr −1 ) and DeVries and Weber (2017, 100 m flux = 6.7 Pg C yr −1 ). This particle flux was augmented by 3.6 Pg C yr −1 of dissolved (and other nonsinking) organic material, or 36% of the carbon flux. This is consistent with estimates from Hansell (2002) suggesting that dissolved organic carbon contributes~30% of the carbon flux to the main thermocline.
Climatological simulated particle export rates were modestly correlated with individual thorium-based export studies compiled by Henson et al. (2019) (Figure 15, top right, r = 0.36 for point-to-point comparisons, r = 0.45 after log-transform), though this was also the case for other widely used export flux algorithms matched to these patchy data (Henson et al., 2012). The model agreed well with latitudinal trends in the thorium estimates. Furthermore, the simulated export of 20-50 mg C m −2 day −1 in subtropical gyres In all cases, simulations results are compared with data from GLODAPv2 (Table 3).
agreed well with measurements suggesting 20-35 mg C m −2 day −1 at the Hawaii Ocean Time Series (HOTS) and the Bermuda Atlantic Time Series (BATS) (e.g., Roman et al., 2001;Steinberg et al., 2001). Export of~100 mg C m −2 day −1 in the eastern and central equatorial Pacific agree well with measurements during the JGOFs EqPAC program of~96 mg C m −2 (Dunne et al., 2005).
Particle export ratios ( Figure 15, bottom panel), defined as the carbon flux from particles at 100 m divided by NPP, are consistent with well-documented latitudinal gradients featuring peak values near the poles, lows approaching~0.05 in the subtropical gyres, and modestly elevated values (~0.1) along the equatorial upwelling (DeVries & Weber, 2017;Dunne et al., 2005;Siegel et al., 2014). These latitudinal patterns are also evident in the Thorium-based export when normalized by the satellite-based monthly NPP climatology derived from the mean of the Carr, VGPM and Eppley-VGPM algorithms, though considerable patchiness reduces the correlation to 0.32 (0.41 after log-transform).
Mineral fluxes at 100 m are summarized in Figure 16. Calcium carbonate fluxes were highest at the equator, reflecting high rates of primary and secondary productivity (Figures 9 and 10) and favorably high Ω Calc and Ω Arag (Figure 14). Highly productive Atlantic Eastern Boundary Upwelling Systems (EBUS) also supported high calcium carbonate export fluxes, but EBUS fluxes were lower in the Pacific due to less favorable saturation states. The overall calcium carbonate flux of 0.40 Pg C yr −1 (0.34 calcite, 0.06 aragonite) is on the low end of prior estimates (0.37-1.8 Pg C yr −1 ), as summarized in Dunne et al. (2007), with the most recent Dunne et al. (2007) estimate suggesting a calcium carbonate flux between 0.37 and 0.67 Pg C yr −1 .
The global silica flux was 87 Tmoles Si yr −1 , just below the range of 88-122 Tmoles Si yr −1 suggested by Tréguer and De La Rocha (2013) but within the 66-136 Tmoles Si yr −1 range estimated by Dunne et al. (2007) and the 70 and 180 Tmoles Si yr −1 suggested by previous studies (Gnanadesikan & Toggweiler, 1999;Heinze et al., 2003;Jin et al., 2006;Nelson et al., 1995). The spatial pattern of silica export reflected a convolution of productivity and silica availability, with the largest fluxes occurring in the high productivity, silica rich Comparisons were made by month for the closest grid cell but were plotted by latitude in the figure to assess latitudinal tendencies. Bottom row: The simulated particle export (pe-) ratio, defined as the carbon flux from particles at 100 m divided by NPP (left panel) and a comparison of simulated pe-ratios (red dots) to those derived by normalizing thorium-based export estimates with NPP derived from the mean of VGPM, Eppley-VGPM, and Carr satellite-based NPP algorithms (black dots). Comparisons were arranged by latitude as in top panel.

Journal of Advances in Modeling Earth Systems
Northwest Pacific ocean. The flux of lithogenic material, in contrast, was highest in the equatorial Atlantic, where a combination of high dust deposition ( Figure 6) and outflows from the Amazon, Orinoco and Mississippi rivers resulted in mineral-rich detrital fluxes from the euphotic zone.

Subsurface Biogeochemistry and Carbon Export to Depth
Skillful resolution of the surface biogeochemical dynamics analyzed in the previous two sections is integral to the accurate projection of future surface ecosystem stressors and the oceanic uptake of atmospheric CO 2 . Most of the ocean's carbon inventory, however, lies at depth. This section thus analyzes the downward fluxes of organic material from the surface, its remineralization at depth, and the ocean properties arising from the convolution of these biogeochemical processes with ocean circulation.
Simulated oxygen and macronutrient distributions in the mesopelagic zone (~500 m) were broadly consistent with observed patterns ( Figure 17). ESM4.1 still overexpresses open ocean hypoxia, similar to CMIP5 models ( Figure 17, top panel; Cabré et al., 2015), but the overall agreement with spatial patterns in subsurface oxygen was strong (r = 0.92). ESM4.1 also simulated the observed enrichment in phosphorus relative to nitrogen arising from gradual denitrification along the Atlantic-to-Pacific pathway of the global thermohaline circulation ( Figure 17, middle panels, the global distribution of denitrification is provided in Appendix B and Figure B7). However, the overexpression of equatorial hypoxia led to excessive nitrate depletion along the eastern boundaries of the Pacific and Atlantic basin. The surplus of phosphorus relative to nitrogen was particularly strong off the Pacific coasts of Mexico and Central America, explaining the sharp nitrogen fixation peak in this region ( Figure 11, bottom panel).
Iron concentrations at 500 m ( Figure 17, bottom row) had relatively uniform values of~0.5 μmoles m −3 . This is consistent with the magnitude of observations, but the model did not capture patchy observed variations around this mean value (r = 0.15). The most notable variation in simulated 500 m iron concentrations were relative low simulated values (0.2-0.4 μmoles m −3 ) in offshore hypoxic waters. High vertical fluxes of detritus in these regions ( Figure 15) resulted in fast iron scavenging rates compared with slow replenishment via remineralization in hypoxic waters. Iron enhancements at 500 m from elevated sedimentary iron inputs under hypoxic conditions (Dale et al., 2015) were largely restricted to coastal regions. The paucity of iron observations in these regions make the realism of this feature difficult to evaluate, but there is little evidence for it in iron transects off Southern Peru. In addition, evidence from other oxygen minimum zones suggests the potential for enhanced iron concentrations in low-oxygen conditions due to the enhanced solubility of Fe (II) redox state iron (e.g., Lohan & Bruland, 2008;Moffett et al., 2015). An underestimation of subsurface iron off Peru would help explain the overestimation of surface nitrate apparent in Figure 8.
The impact of hypoxic zones was strongly reflected in the carbon transfer efficiencies between 100 and 800 m, and between 100 and 2,000 m depth horizons. Approximately 40% of the carbon exported from the euphotic zone over hypoxic zones reached 800 m (Figure 18, upper left panel), and transfer efficiencies remained above 20% at 2,000 m beneath the intense hypoxic zones in the eastern equatorial Pacific (Figure 18, upper right panel). Similarly high transfer efficiencies are maintained in the western part of the subtropical North Atlantic due to large lithogenic mineral inputs from rivers (Amazon, Orinoco, and Missippi) and atmospheric deposition (Figures 6 and 16).
Extratropical transfer efficiencies at 800 m were elevated relative to tropical/subtropical regions without hypoxia or large lithogenic inputs. This is consistent with the findings of Marsay et al. (2015), and the

Journal of Advances in Modeling Earth Systems
temperature-dependent remineralization rate used in COBALTv2 (Laufkötter et al., 2017). The pattern was reversed by 2,000 m, as elevated tropical calcium carbonate fluxes, hypoxia, and lithogenic fluxes remove the signals generated by temperature contrasts in the upper water column. The resulting organic carbon flux at 2000 is 0.43 Pg C yr −1 , shows broad latitudinal patterns consistent with trap data (Figure 18, bottom panel), and is nearly identical to the estimate of Honjo et al. (2008).
The export properties of the "calcium carbonate pump" are summarized in Figure 19. Calcite export at 2,000 m (top left panel) is 0.27 Pg C yr −1 , 84% of surface values, reflecting ubiquitous high transfer efficiencies (top right panel). Attenuation beyond 10% of the flux at 100 m was almost entirely restricted to the  Table 3 for details). Top row: Oxygen. Second row: Nitrate. Third row: The surplus of nitrate relative to phosphate expressed relative to Redfield: NO 3 − 16 × PO 4 . This quantity is related to the quantity N* defined by Gruber and Sarmiento (1997). We have omitted a constant applied by Gruber and Sarmiento to focus on deviation in the NO 3 :PO 4 supply ratio from the Redfield ratio. Bottom row: Dissolved iron.

Journal of Advances in Modeling Earth Systems
Northeast Pacific ocean. This reflected realistically simulated gradients in the calcite saturation state at 2,000 m (Ω calc , second row). The simulated saturation state is biased slightly high in the Atlantic, and low Pacific values extend further south than observed, but cross-basin contrasts are well captured. The simulated fluxes and saturation states produce sedimentary calcite distributions that are broadly consistent with cross-basin patterns observed by Seiter et al. (2004). The highest sedimentary calcite concentrations are found in the Atlantic, moderate values in the Southern Ocean basins, and lowest values are in the corrosive North Pacific. Within all basins, calcite concentrations are highest along the mid-ocean ridges. Comparisons for the less prominent aragonite flux, which shows much lower transfer efficiencies and is not preserved in sediments within ESM4.1, are provided in Appendix B and Figure B8.
The imprint of the biological pumps described above, combined with a robust depiction of ocean water masses , yielded skillful simulations of subsurface macronutrients, oxygen, ocean carbon, and alkalinity profiles across ocean basins (Figures 20-24). Quantities were plotted running north to south along the A16 CLIVAR repeat hydrography transect, turning west to join follow the S04 transect through the Southern Ocean, before turning North at the P16 transect to run south to north through the Pacific Ocean (Figure 20). Nitrate and phosphate (Figure 21, top two panels) both capture low observed concentrations in relatively young North Atlantic Deep Water (NADW), overlain at lower Latitudes by NO 3 and PO 4 -enriched intermediate waters. Low NO 3 /PO 4 deep waters extend southward to 40°S. Consistent with observations, the Southern Ocean had nearly uniform NO 3 and PO 4 at all depths, with the extension of high values to the surface reflecting upwelling and iron limitation in surface waters (i.e., Figure 11). NO 3 and PO 4 become progressively enriched as the transect proceeds northward in the Pacific in a manner consistent with the thermohaline circulation. We note, however, that the skillful representation of deep-water patterns, particularly those in North Pacific, may partly reflect the limited~600 year ocean evolution since initialization (section 3.1, Séférian et al., 2016). The strong denitrification signal associated with the overexpression of hypoxia in the equatorial Pacific (i.e., Figures 17 and B7) is visible as a region of depleted nitrate between 500 and 1,200 m just north of the equator. Like observed values, the simulated SiO 4 nutricline is deeper than those for NO 3 and PO 4 (Figure 21, Row 3), reflecting larger depth-scale for SiO 4 dissolution relative to organic matter remineralization. This results in an accurate simulation of "silicate trapping" in the Southern Ocean (Sarmiento et al., 2004) and depleted silica across most of the intermediate waters of the global ocean. SiO 4 also accumulates in the deep waters of the North Pacific in a manner consistent with observations. The vertical distribution of iron is far less constrained than macronutrients ( Figure 21, bottom row). ESM4.1 simulates the gradient of dissolved iron between nearly depleted values at the ocean surface and enriched values (~0.5-0.7 μmoles) at depth. However, there is little evidence for the simulated patchiness of iron at depth, with some regions dropping below~0.35 μmoles. Consideration of the simulated oxygen transect ( Figure 22) shows that much of the low subsurface iron signal in intermediate depth waters is associated with hypoxia in a manner consistent mechanisms discussed in Figure 17 (low simulated iron dissolution but high iron scavenging onto sinking particles hypoxic waters). The more modest depletion of iron in deeper waters could be to simplifications in  the iron dissolution and ligand/scavenging dynamics, or limited constraints on sedimentary or geothermal processes and subsequent transport (Dale et al., 2015;Resing et al., 2015). We will discuss subsurface iron concentrations further in section 4. We note, however, that the issues highlighted by Figure 21 do not compromise the realism of simulated surface iron limitation of phytoplankton production ( Figure 11).  Table 3. For iron, the mean of observations in each model grid cell was first taken. Cells within five grid points on either side of the transect were then identified, and the average of these taken. Any values from waters <500 m deep were excluded to ensure that averages represented oceanic, rather than coastal conditions, sampled along the transect. Skill metrics were calculated relative to these averages. Finally, values were coarsened onto 50 m depth bins (at the surface) and 500 m depth bins (at depth) and 250 km segments to aid with the visualization provided in the bottom right panel.

Journal of Advances in Modeling Earth Systems
Vertical oxygen profiles ( Figure 22) revealed that the overexpression of equatorial Pacific hypoxia extended to depths of~3,000 m and that North Pacific waters in the upper 1,000 m were more oxygenated than observed. However, ESM4.1's simulation of broad-scale oxygen distribution along these transects is generally robust, with low bias, RMSE and correlations exceeding 0.94 for both oxygen and the apparent oxygen utilization (AOU).
Moving to the carbon system (Figure 23), dissolved inorganic carbon and excess alkalinity signals associated with NADW were well represented in the North Atlantic. Low alkalinity signals associated with low salinity Antarctic Intermediate Water (AAIW) descended from the ocean surface between 50°S and 60°S Latitude in both ocean basins, before extending northward within the ocean thermocline. Low alkalinity signals associated with the relatively fresh North Pacific Intermediate Water (NPIW) were similarly well captured. In deeper waters, DIC and Alkalinity increased from the Atlantic to the Pacific in a manner consistent with observations. Vertically, strong increasing DIC gradients were evident throughout the upper ocean layers, consistent with rapid organic matter remineralization at these depths. Alkalinity gradients, in contrast, sit deeper in the water column reflecting the high transfer efficiency of the calcium carbonate pump to deep waters (i.e., Figure 19). ESM4.1's resulting fidelity with observed excess alkalinity across ocean basins (Figure 23, third row) is indicative of a realistic depiction of the ocean's carrying capacity for DIC. Finally, the vertical distribution of dissolved organic carbon (Figure 23, bottom row) was generally consistent with observations taken during CLIVAR repeat hydrography cruises. This includes the realistic penetration of semirefractory DOC into deep North Atlantic waters.
To further probe ESM4.1's capacity to represent DIC and Alkalinity gradients associated with biological pumps as opposed to salinity-induced gradients, distributions were normalized to a salinity of 35 PSU. We also adjusted for the mean bias to better compare observed versus simulated vertical gradients. ESM4.1's fidelity with observations remained strong (Figure 24), with the consistent contrasts between the shallow DIC gradient and deeper alkalinity gradients becoming increasingly evident. The relatively low calcite flux from the upper ocean (i.e., Figure 16) still produced alkalinity gradients that were consistent with those observed.  Figure 20. AOU was calculated as the difference between simulated/observed oxygen and saturated oxygen conditions.

The Ocean Biogeochemical Response to CO 2
The previous section provided a broad assessment of ESM4.1's capacity to capture observed biogeochemical and ecosystem patterns central to the biological pump and ecosystem stressors. While several issues were identified (e.g., overexpression of hypoxia and phosphate limitation, unresolved iron dynamics in low oxygen waters) the overall fidelity of ESM4.1 with observed ecosystem and biogeochemical patterns firmly supports its use as a research tool, and builds confidence in the use of its projections for policy purposes. This confidence, however, also hinges on the capacity of ESM4.1 to reliably simulate the transient response of ocean ecosystems and biogeochemistry to increasing CO 2 . This section thus compares ESM4.1's uptake of CO 2 over the historical period with observation-based estimates and characterizes the ecosystem's response to historical forcing averaged over years 101-120 of a 1% CO 2 yr −1 increase (i.e., an~200% (3 times) increase over preindustrial atmospheric CO 2 concentrations).
Carbon uptake estimated from the concentration-forced ESM4.1 historical simulation agrees well with published cumulative uptake estimates (Table 4; Figure 20. Skill metrics associated with each comparison are provided in the titles over the right hand panels. Comparisons with DOC were made against simulated averages from 1992-2012. For DOC, point observations were averaged to the ESM4 grid. Cells within five grid points on either side of the transect were then identified, and the average of these taken. Any values from waters <500 m deep were excluded to ensure that averages represented oceanic, rather than coastal conditions, sampled along the transect. Skill metrics were calculated relative to these averages. A five point low-pass filter (i.e., spanning about 2.5°latitude or longitude along the transect) was then applied to the observations for plotting purposes. A 10-point low-pass filter was then used to interpolate over smaller gaps in data coverage.

Journal of Advances in Modeling Earth Systems
ESM4.1's estimates fall below the central estimate for each study but are always within the uncertainty range. However, ESM 4.1's carbon uptake estimates are referenced to 1850 while the estimates of Sabine et al. (2004) and Khatiwala et al. (2009) are referenced to 1791 and 1765, respectively. Adjusting ESM4.1 for pre-1850 uptake and the lasting effects of the unresolved 1850 air-sea disequilibrium  Note. For the Sabine and Khatiwala studies, ESM4.1's estimates reflect carbon uptake between 1850 and the end of the estimation period for each study (1994 and 2008, respectively). Adustments for pre-1850 uptake and the lasting effects of unresolved 1850 air-sea disequilibrium range from~10-20 Pg C to compare with estimates references to 1791 and~10-30 Pg C for estimates references to 1765 (Bronselaer et al., 2017;Le Quéré et al., 2018). Comparison against Gruber et al. (2019) is for the analogous period in ESM4.1.

Journal of Advances in Modeling Earth Systems
STOCK ET AL. (Bronselaer et al., 2017;Le Quéré et al., 2018) yields uptake estimates that both fall within the uncertainty of past estimates and do not show any evidence of systematic bias (Table 4).
The spatial pattern of ESM4.1's carbon uptake is consistent with estimates provided by GLODAPv2 ( Figure 25). Both GLODAP and ESM4.1 show large anthropogenic carbon inventories in the North Atlantic. The primary contrast is along the equatorward edge of the Southern Ocean, near the AAIW formation regions. Both ESM4.1 and GLODAPv2 exhibited elevated carbon inventories in this region, but the ESM4.1 peak was muted relative to GLODAPv2. We note, however, that the GLODAPv2 anthropogenic carbon uptake is high relative to published estimates in Table 5 (168 Pg C in 2002). Further analysis of the ESM4.1's CO 2 uptake is provided in Dunne et al. (2020), and comparisons with other CMIP6 models can be found in Arora et al. (2019).
The response of the surface carbon system, the biological pump, the ocean carbon inventory, and ecosystem stressors to increasing CO 2 is summarized in Table 5. At the ocean surface, pCO 2 predictably responds in near-direct proportion with the increase in atmospheric CO 2 . The surface DIC response over this historical period reflected a Revelle factor of~28.1/2.52 = 11.2, consistent with characteristic values in today's ocean. This contrasts with a significantly larger characteristic Revelle factor of~187/9.66 = 19.4 associated with the DIC response at 3 times preindustrial CO 2 via 1% yr −1 increases. The consequences of acidification and warming on the Revelle factor are only modestly tempered by surface alkalinity increases associated with declines in calcium carbonate production. The much larger magnitude of the DIC increase relative to alkalinity results in steep drop in excess alkalinity, consistent with the increasing Revelle factor.
The organic carbon pump shows a relatively modest change (<10%) relative to the preindustrial control in the historical run but declines exceed 10% at the surface and 30% at 2,000 m for years 101-120 of the 1% CO 2 yr −1 case. The response of the calcium carbonate flux is even greater, with an~19% decrease during the historical period and a~67% decrease at 3 times preindustrial atmospheric CO 2 . This decrease in the

Journal of Advances in Modeling Earth Systems
mineral flux, together with the stimulatory effect of increasing temperature on organic matter remineralization (Laufkötter et al., 2017), underlies the larger fractional organic matter flux declines deeper in the water column (i.e., the remineralization length scale decreases).
The percent change in the total ocean carbon inventory is far more modest than the surface response for both experiments (+0.34% and +1.33%). However, the DIC changes were an order of magnitude larger than alkalinity increases (0.019% and 0.09%), leading to marked declines in excess alkalinity (Alk-DIC) of −6.11% and −23.4% for the years 1995-2014 of the historical simulation and years 101-120 of the 1% CO 2 yr −1 atmospheric CO 2 increase experiment, respectively.
The global ecosystem response is characterized by a 7.24% NPP decline in the high CO 2 case. This is consistent with the range of past model responses but differs from previous GFDL ESMs, which exhibited more stable NPP responses to climate change Laufkötter et al., 2015). This contrast will be discussed in section 4. Percent NPP changes are amplified at higher trophic levels, with mesozooplankton production declining by −11.2% globally in the high CO 2 case. The dynamics underlying this amplification, which has been shown to be a robust outcome across Global ESMs (Kwiatkowski et al., 2019), include declining consumer growth efficiencies and longer food chains in a more stratified, less productive ocean (Stock et al., 2014a). The amplification extends to the flux of material to the benthos (−17.4%) that fuels benthic and demersal fisheries .
The robust decline in ocean oxygen under CO 2 forcing results in a moderate increase in the volume of waters with oxygen <50 mmoles m −3 . Changes in the least oxygenated waters (<5 mmoles m −3 ), are less clear. The Note. The piControl column provides the value of the each quantity under preindustrial conditions. The ΔHistorical denotes the change between the historical period (1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014) and the piControl. This reflects an approximate 30% increase in atmospheric CO 2 relative to preindustrial conditions. The Δ(1% CO 2 yr −1 ) column denotes the change between years 101 and 120 of the of a 1% CO 2 increase per year simulation and the piControl. This reflects a 200% increase in CO 2 (i.e., three times preindustrial CO 2 ), similar to that projected at the end of the 21st century under a high emissions scenario. Changes are expressed in absolute terms, and as a % change relative to the piControl. Projected changes greater than 10% are italicized. Changes greater than 30% are in bold. a Includes both calcite and aragonite fluxes.

Journal of Advances in Modeling Earth Systems
historical run suggests a modest increase in the volume of these least oxygenated waters, consistent with data-based inferences (Oschlies et al., 2018;Schmidtko et al., 2017;Stramma et al., 2012). There is, however, a moderate decrease in the volume of these least-oxygenated waters under strong CO 2 forcing. Inconsistent projected O 2 trends near oxygen minimum zones have been noted in CMIP5 ESMs , which also poorly resolve ventilation pathways in these regions (Cabré et al., 2015; Figure 15). Both the historical (1995-2014) and 1% CO 2 yr −1 to tripling cases have increased AOU, indicating that the positive effect of decreasing particle fluxes on interior oxygen is outweighed by the negative imprint of longer subsurface residence times for deep ocean waters. This was also a robust feature of past CMIP5 and CMIP6 ESMs (Arora et al., 2019;Schwinger et al., 2014). GFDL-ESM4.1 projects consistent increases in the volume of waters under-saturated with respect to calcite and aragonite, with modest increases in the former and moderate increases in the latter.

Discussion
The objectives of this contribution were to (1) communicate the formulation of the ocean biogeochemical component of GFDL-ESM4.1 and the principles underlying it (section 2), (2) quantitatively assess GFDL-ESM4.1's capacity to capture observed biogeochemical patterns central to ocean carbon and ecosystem dynamics (sections 3.1 and 3.2), and (3) document relevant biogeochemical and ecosystem responses to historical forcing and enhanced CO 2 similar to that expected under high emissions scenarios at the end of the century (section 3.3). As discussed in section 2, the biogeochemical model formulation (COBALTv2) reflects tradeoffs between aspirations for a holistic, mechanistic model; and pragmatic considerations of computational costs, CMIP6 timelines, and process uncertainties. The strengths and limitations of COBALTv2 reflect this tension.
Simulations robustly captured large-scale patterns in surface and deep carbon system properties (Figures 4  and 5, and Figures 23 and 24) yielding a carbon uptake over the historical period that agrees with observed constraints (Table 4 and Figure 25). Agreement extends to the competing thermal and biological pCO 2 imprints on seasonal scales ( Figure 5) that have been shown to impose significant constraints on future carbon uptake (Goris et al., 2018). Carbon system properties arose from a generally skillful depiction of surface and deep macronutrients (Figures 8, 17, and 21) and the distribution of iron-limited HNLC regions ( Figure 11). Nutrient and carbon dynamics shaped (and were shaped by) a realistic depiction of cross-biome differences and seasonal patterns in plankton ecosystems (Figures 9-13). These dynamics generated a biological pump that was consistent with large-scale observed constraints (Figures 15,16,18,and 19).
In sum, these generally positive results build confidence in the GFDL-ESM4.1's capacity to estimate the response to dynamic biogeochemical and ecosystem responses to climate change (Table 5). We note, however, that observational constraints are limited, divergent model formulations may be able to produce similar results, and simulations were not without significant deficiencies. Phosphorus limitation was too prominent in some regions (Figure 11), as were open ocean hypoxic regions (Figures 17 and 22). Subsurface iron concentrations exhibited reduced values in oceanic oxygen minimum zones that were not corroborated by observations (Figures 17 and 21). There were significant regional NPP and chlorophyll biases (Figures 9, 10, and B3).
The large-scale fidelity with observed macronutrient, oxygen, and carbon constraints demonstrated herein was broadly consistent with GFDL's previous ESM2 series of ESMs, as were many of the limitations (e.g., the overexpression of open ocean hypoxic zones; Dunne et al., 2013). Analysis of changes in model skill between CMIP5 and CMIP6 conducted by Séférian et al. (2020), however, highlight significant improvements in the simulation of surface iron, nitrate, silicate, and the air-sea exchange of carbon dioxide. Other advances are reflected in the enhanced representation of ocean biogeochemical processes (Table 2), including improved representation of plankton food web processes (Figures 1, 12, and 13), exchanges of nutrients between Earth system components ( Figures 6 and 7), and data-and theory-driven refinements to particle remineralization (Laufkötter et al., 2017) and other aspects of the model formulation ( Table 2).
Resolution of plankton food web dynamics enabled quantitative comparisons with energy flows through the microbial food web ( Figure 12) and mesozooplankton ( Figure 13) critical for resolving the processes underlying projected changes in future fisheries yields (Lotze et al., 2019;Stock et al., 2014aStock et al., , 2017. The systematic calibration of plankton food web dynamics against observed constraints (Stock et al., 2014b;Stock & Dunne, 2010) also generated global net primary productivity (51.1 Pg C yr −1 ) that was substantially lower than GFDL's ESM2 series (83 and 68 Pg C yr −1 ) and more consistent with the majority of satellite-and observation-based estimates (Figures 9 and 10; Buitenhuis, Hashioka, et al., 2013;Carr et al., 2006). It is furthermore notable that the NPP response of ESM4.1 to high CO 2 forcing (−7.71%) contrasts with both of GFDL's ESM2 series models that, unlike most CMIP5 models, showed little NPP change under high emissions scenarios Laufkötter et al., 2015). It is notable that an NPP decline was projected despite the adoption of a temperature-dependent particle remineralization rate. Such a dependence can mediate or, in some cases, reverse projected NPP declines arising from increasingly stratified waters under climate change by enhancing near-surface remineralization (Taucher & Oschlies, 2011). This negative feedback, however, was not strong enough in COBALTv2 to prevent a projected NPP decline.
An additional aspiration for the improved plankton food web dynamics in COBALTv2 was that it might improve the model's capacity to capture full range of observed ocean chlorophyll, which spans over 3 orders of magnitude. However, COBALTv2 fell short of this goal, suffered from significant regional biases (e.g., the high chlorophyll bias in the eastern Pacific equatorial upwelling zone in Figure 9), and underestimated the depth of subsurface chlorophyll maxima in subtropical gyres ( Figure B3). Recent results have suggested that a combination of increased resolution of ocean dynamics and phytoplankton functional types may be needed to address these limitations (Van Oostende et al., 2018). Enhanced representation of top-down controls may also be required to fully capture bloom dynamics (Behrenfeld, 2014) and the depth of the subsurface chlorophyll maximum (Moeller et al., 2019). Increased resolution of plankton food webs and flexible trophic interactions, including feeding by unresolved mixotrophic and gelatinous species, will likely also influence patterns to trophic transfer and trophic amplification evident in Table 5 (Fuchs & Franks, 2010;Hansen et al., 1994;Ward & Follows, 2016).
For the biological pump, perhaps the most significant change from GFDL's ESM2 series of models was the inclusion of temperature-dependent remineralization following the empirical and mechanistic arguments presented by Laufkötter et al. (2017). The implications of this change are evident in the enhanced transfer efficiency to 800 m in cold, high-latitude waters ( Figure 18, consistent with the findings of Marsay et al., 2015), and it is likely reflected in the shoaling remineralization length-scale evident in the biological pump responses to three times preindustrial CO 2 (Table 5, note that the % decrease in the organic matter flux becomes progressively larger with depth). Previous work has illustrated the strong controls on CO 2 -uptake imposed by the remineralization length scale (Kwon et al., 2009). While Laufkötter et al. (2017) established firmer constraints on the environmental dependence of remineralization rates, considerable uncertainty remained. Further tests are needed to understand the sensitivity of responses to the range of temperature and oxygen dependences capable of explaining observed particle remineralization rates similarly well. Similar uncertainties in the response of calcium carbonate production to acidification also require exploration. We further note that ESM4.1 does not yet consider vertical migration, which can have significant impacts on the biological pump and may exhibit complex responses to climate change (e.g., Steinberg & Landry, 2017).
Final notable additions to ESM4.1 relative to previous generations of GFDL's ESMs include a larger number of nutrient sources and exchanges between Earth system components (e.g., Table 2 and Figures 6 and 7). These enable new avenues of inquiry into the drivers of ocean biogeochemical variability and change (Evans et al., 2016;Laufkötter et al., 2018;Paulot et al., 2020), but the increasingly complex mosaic of nutrient sources, often characterized by extreme departures from Redfield stoichiometry, posed a challenge for COBALTv2. The primary manifestation of this was the likely overexpression of the extent of phosphate limitation in the North Atlantic and its appearance in several other regions ( Figure 11). In most cases, the shift to phosphate limitation is subtle such that it does not result in large amounts of excess nitrogen. It does, however, contribute strongly to the low nitrogen fixation bias in the model.
The overexpression of phosphorus limitation likely has several underlying drivers. First, exogenous nutrient inputs were skewed toward N. For rivers, only dissolved inputs were considered and these tend to be rich in N relative to P (Seitzinger et al., 2005). More phosphorus is present in particulate detrital inputs, but we assumed that particulate inputs were buried in nearshore regions. Similarly, while dynamically varying nitrogen supply from the atmosphere was considered (Figure 7), we omitted phosphate deposition. Atmospheric P inputs are small relative to nitrogen, but they are not insignificant (Paytan & McLaughlin, 2007).
A second factor that likely contributed to the overexpression of oceanic phosphate limitation was coarse ocean model resolution. Coarse-resolution ocean simulations (1°to (1/2)°) tend to under-estimate the residence time of continental shelf waters by >25% relative to similarly configured (1/8)°configurations (Liu et al., 2019). This reduces the capacity of sedimentary denitrification in coastal regions to erode high N:P signatures associated with rivers.
Lastly, the rigidity of the plankton N:P stoichiometry in COBALTv2 also contributed to the overexpression of phosphate limitation. This was a simplification from the dynamic N:P variations included in previous generation GFDL ESMs . Stoichiometric values for different phytoplankton types were chosen to reflect mean tendencies with size (Finkel et al., 2009;Quigg et al., 2003) while ensuring mean values were near Redfield averages. Small phytoplankton were thus given a value of 20:1, large phytoplankton 12:1, and diazotrophs, which comprise a small portion of overall plankton productivity, 40:1. The emergent N:P ratio of the particle export approached 18:1 in the oligotrophic gyres, which are dominated by small plankton, and 13:1 in high latitudes dominated by large phytoplankton (Appendix B and Figure B9). These ratios, however, resulted in a global ratio of nitrogen to phosphorus in particle export of~15:1, slightly below Redfield and thus favoring phosphate limitation. This tendency was partly counteracted by high N:P ratios in the surface dissolved organic matter pool ( Figure B9), which contributed 36% of the export across 100 m globally (section 3.2.3), but not by enough to prevent the overexpression of P-limitation. Furthermore, the nitrogen rich nature of the dissolved organic matter pools reduces the potential for dissolved organic phosphorus uptake to alleviate P limitation without invoking differential availability of dissolved organic N and P pools.
Recalibration of fixed N:P ratios offers a rapid and computationally efficient means of calibrating the degree of N and P limitation, but would not capture the full dynamic range of N:P ratios as they respond to dynamic variations in ambient nutrients and cellular requirements in ocean and coastal waters (e.g., Glibert, 2012;Klausmeier et al., 2004;Martiny et al., 2013). The choice to model N:P stoichiometry with fixed ratios in COBALTv2 reflected two pragmatic considerations: (1) allowing N:P ratios to vary would require an additional prognostic state variable for each plankton type and (2) a paucity of observations and process-level understanding and direct observations made a defensible and globally robust formulation difficult to formulate. While computational cost remains a concern, rapid accumulation of measurements elucidating ocean biome and seasonal stoichiometric variations (Martiny et al., 2013;Singh et al., 2015;Talarmin et al., 2016) together with growing mechanistic understanding of observed variations and development of models capturing it, recently reviewed by Moreno and Martiny (2018), have greatly reduced the barriers imposed by the second consideration.
While ESM4.1 included notable advances in linkages between Earth system components to better resolve the ocean to global change, it still omits many others. Riverine nutrient concentrations, for example, were still specified as external forcing from an independent model (Seitzinger et al., 2005). These concentrations reflect contemporary nutrient concentrations. The simulation can capture load changes associated with freshwater flows, but not those associated with land use, fertilizer changes, or sewage inputs. Recent work has addressed this for nitrogen (Lee et al., 2019), but these efforts must be extended to capture the evolution of riverine and groundwater inputs of other ecologically significant nutrients undergoing dynamic changes (e.g., silicon; Laruelle et al., 2009) as well as carbon and alkalinity fluxes. Improved resolution of shelf sea dynamics as global ocean models are refined to (¼)°and beyond (Liu et al., 2019) may allow for an increasingly global perspective on the imprint of these processes.
Similarly, ESM4.1 captured variations in dust and iron due to natural effects of wind and vegetation state (e.g., Evans et al., 2016) but did not capture significant fluxes from industrial inputs (e.g., Matsui et al., 2018). Finally, air-sea exchanges of reduced inorganic nitrogen presents a novel integration of atmosphere and ocean modulated nitrogen exchanges, but ESM4.1 omits the air-sea exchange of organic matter that have implications for both climate and ecosystem processes (e.g., Burrows et al., 2014).
The pace of these and other advances will be dictated by their importance to primary model objectives and tractability. However, a key message of the results herein was the importance of consistency between diverse nutrient sources, and the importance of sufficient model complexity to handle increasingly dynamic nutrient inputs.  (Figures 17 and  22), which has carried over from CMIP5 (Cabré et al., 2015), provides a powerful example of this. The fingerprints of this stubborn bias extend beyond oxygen and radiate outward: Excessive denitrification creates extremely low NO 3 :PO 4 ratios, which create downstream hot spots of nitrogen fixation (Figure 11). The extension of extreme hypoxic zones over great depths drives extremely high organic carbon transfer efficiencies to deep waters ( Figure 19). Recent work has reinforced the hypothesis that higher resolution ocean models may partly address the bias by more fully resolving the equatorial jets ventilating the hypoxic regions (Busecke et al., 2019). Other lines of evidence have pointed to productivity biases linked to exogenous nutrient supplies (Cabré et al., 2015) or vertical zooplankton and fish migrations . Coordinated effort across disciplines and Earth system components will be required to address these challenges and better understand and constrain ocean future.

Appendix A: COBALTv2 Formulation
COBALTv2 is an updated version of COBALTv1, which is described detail in Stock et al. (2014b). This appendix provides a brief overview, a detailed description of new elements, and a complete list of parameter values used for GFDL-ESM4.1 simulations contributed to CMIP6.
Like COBALTv1, COBALTv2 includes 33 prognostic tracers (Table A1). The primary model currency is nitrogen, which is held in a constant 106:16 C:N ratio in all organic material. The ratio between nitrogen and other elements is dynamic for some model components (i.e., variable Fe:N and Si:N ratios for phytoplankton) and static for others (Fixed P:N ratios for each plankton functional type). Further details of the stoichiometric assumptions are provided in the subsections that follow.
The primary transformations associated with the biological pump and the transfer of energy in the plankton food web are the production of organic matter, mainly in the surface ocean, and the remineralization of organic material, which is distributed across depths. Organic matter transformations and associated DIC, alkalinity and oxygen changes are formulated after Anderson (1995). Nitrate-based production is This increases alkalinity by 16 mole equivalents, or 1 mole equivalent per mole of NO − 3 fixed. Note that Equation A1 and those that follow in this section focus on the primary N, C, and O components. P, Fe, and Si are then carried with N in ratios that depend on the plankton functional type and processes (Figure 1 and Appendices A.1-A.3 below). Recycled production based on ammonium is which decreases alkalinity by 1 mole equivalent per mole of NH þ 4 fixed. Aerobic remineralization to NH þ 4 follows the reverse pathway and increases alkalinity by 1 mole equivalent. Remineralization via this pathway can occur via bacterial decomposition or respiration by zooplankton or higher predators (see Appendix A.2 for details). Nitrification of NH þ 4 back to NO − 3 is This decreases alkalinity by 2 units per mole of NH þ 4 oxidized to NO − 3 . Thus, a cycle of NO − 3 -based production (Equation A1), aerobic decomposition to NH þ 4 (Equation A2) and eventual remineralization to nitrate (Equation A3), results in no net alkalinity or oxygen change. Production via nitrogen fixation is: This has no impact on alkalinity. Organic matter created via nitrogen fixation will thus generate a loss of 1 mole equivalent of alkalinity upon eventual remineralization to NO − 3 via Equations A2 and A3. The negative alkalinity drift associated with this pathway is offset by a positive drift associated with denitrification: Increasing alkalinity by 552/472 = 1.17 mole equivalents per mole of nitrate denitrified, or 552/(5 * 16) = 6.9 mole equivalents per mole of organic nitrogen decomposed.
The transformations in Equations A1-A5 are carried out within the plankton food web illustrated in Figure 1. Appendices A.1-A.3 briefly describe food web components and provide tables giving the parameter values for each model element. Footnotes to these tables provide additional details underlying the parameter choices, which are also discussed in detail by Stock et al. (2014b). Parameters in the model are given in standard SI units but have been translated to customary units encountered in the oceanographic literature to aid interpretation and evaluation. External sources and sinks are described in the main text.

A.1 Phytoplankton Growth and Nutrient Uptake
The biomass-specific phytoplankton growth rate (day −1 ) for all phytoplankton is simulated as a function of temperature (T), nutrient limitation (N lim ), and irradiance (I) using a modified version of the dynamic model of phytoplankton growth and acclimation described in Geider et al. (1997): All parameter names and values used in GFDL ESM4.1 are summarized in Table A2. P C m T; N lim ð Þis the irradiance-saturated biomass-specific photosynthetic rate at a given temperature (T) and nutrient limitation (N lim ): where e ktempT scales the maximum growth rate with temperature (Eppley, 1972). We note that, while COBALTv2 uses the temperature scaling of Eppley (1972) (and subsequently confirmed by Bissinger et al., 2008), maximum photosynthetic rates reflect higher values in more recent compilations in Geider Note. Note that elements that have fixed proportions with N (e.g., phosphorus in phytoplankton and zooplankton) do not require prognostic variables since they can be diagnosed from N. This differs from iron and silica in phytoplankton, which vary dynamically relative to N and thus require prognostic variables. Chlorophyll is diagnosed from phytoplankton biomass, growth and light limitation as described in Appendix A.1. et al. (1997), Brush et al. (2002), and Bissinger et al. (2008). The Eppley temperature scaling is also used for the phytoplankton basal metabolic rate (b resp ), and the temperature dependence of all biologically modulated processes (see additional discussion in Stock et al., 2014b). N lim is calculated from the nitrogen, phosphorus, and iron limitation based on Liebig's law of the minimum (Liebig, 1840;see Equations A9-A12). The chlorophyll to carbon ratio follows Geider et al. (1997) but imposes a minimum Chl:C ratio: where I 24 is the 24 hr mean irradiance. Nitrogen limitation is calculated via the parameterization of O'Neill et al. (1989), which, unlike COBALTv1, allows for the inhibition of nitrate uptake in the presence of ammonia, and vice versa: where the resulting nitrogen limitation, NO 3lim +NH 4lim , varies between 0 and 1. As described in Stock et al. (2014b), phosphate limitation of growth is modeled using a saturating Monod relationship:  Geider et al. (1997), Bissinger et al. (2008), Brush et al. (2002), and Morel and Bricaud (1981). b Geider (1992), calibrated to observed bloom timing. c Eppley (1972) and Bissinger et al. (2008). d Cloern et al. (1995) and Behrenfeld et al. (2005). e Used to determine balance of nitrogen fixation versus NO 3 and NH 4 uptake. f Paulot et al. (2015Paulot et al. ( , 2020. g Edwards et al. (2012), Eppley et al. (1969), and Holl and Montoya (2005). h Cotner et al. (1997). i Sarthou et al. (2005) and Kustka et al. (2003). j Martin-Jézéquel et al. (2000), Sarthou et al. (2005), and Mongin et al. (2003). k Huntsman (1995, 1997). l Redfield (1934), Quigg et al. (2003), Iron limitation is modeled based on a saturating sigmoidal relationship relative to the internal cell quota (q Fe2N ) following Sunda and Huntsman (1995). This is expressed as an iron deficiency: N lim is then calculated as the minimum of NO 3lim +NH 4lim , PO 4lim , and def Fe in accordance with Liebig's law.
Nitrogen and phosphorous uptake are determined by the product of the growth rate, biomass, and stoichiometry of each phytoplankton group (Table A2). Nitrogen uptake is partitioned between NO 3 and NH 4 in proportion to their relative limitation. Iron uptake is decoupled from growth and thus determined via Michaelis-Mentin kinetics with half-saturation k fed and maximum uptake proportional to the maximum possible growth rate for a given ocean temperature (Stock et al., 2014b).
Silicate does not limit phytoplankton growth in COBALTv2, but silicate uptake is modeled by asserting that the fraction of the large phytoplankton pool comprised of diatoms and the Si:N ratio of nutrient uptake by diatoms are proportional to the ambient silicate concentration following a saturating relationship of the form: Silicate uptake is then determined as where the first term on the right hand side is the uptake of nitrogen associated with the diatom fraction, and the second term on the right hand side determines the Si:N ratio associated with that uptake. Si2N max is set to match Si:N ratios in silicate-rich waters (~3:1; Table A2).
Diazotroph growth is calculated as described above except (1) N lim is calculated based on phosphorus and iron limitation alone; (2) nitrogen uptake is partitioned between N 2 , NH 4 , NO 3 in accordance with the relative availability of NH 4 and NO 3 ; and (3) diazotrophs are assumed to be limited in low oxygen environments.

A.2 Plankton Food Web Dynamics
Phytoplankton production can be consumed by zooplankton, exuded to dissolved organic matter, or aggregate and sink as detritus ( Figure 1). As described in detail elsewhere (Stock et al., 2014b), grazing is modeled with a saturating function of available prey resources, p, characterized by half-saturation constant (K I ) an allometrically constrained, temperature-dependent maximum biomass-specific ingestion rate (I max ): Zooplankton parameter values used in GFDL-ESM4.1 are summarized in Table A3. The availability of each prey type, Φ i , is determined by size (Hansen et al., 1994) and a mild density-dependent switching (Stock et al., 2008). A small refuge concentration of 0.001 μmoles kg −1 is subtracted from each prey resource before it is entered into Equation A15 to sustain prey diversity under adverse conditions. Thirty percent of ingestion is egested and partitioned between detritus and dissolved organic matter according the partitioning parameters φ det , φ ldon , etc. (Table A3). A larger fraction of egestion contributes to sinking material for larger zooplankton types. COBALTv2 does not differentiate between egestion and "sloppy feeding," which both result in fluxes of prey carbon to particulate or dissolved organic pools. The biomass-specific rate of zooplankton population growth (μ Z ) is determined by a constant maximum growth efficiency (gge max ), the biomass-specific ingestion rate I, and temperature-dependent basal metabolic costs b resp : 10.1029/2019MS002043

Journal of Advances in Modeling Earth Systems
The achieved growth efficiency (i.e., gge = μ Z /I) is thus a function of prey resources and temperature that can respond dynamically to climate change forcing (e.g., Stock et al., 2014a). All remaining ingestion is allocated to active metabolism, resulting in remineralization to NH 4 and PO 4 . Zooplankton consume elements in the ratios provided by the prey, but assimilation is dictated by the N:P ratios of the zooplankton, with any differences reflected in the N:P ratios of remineralization. Silica and iron are not tracked in zooplankton. Consumption of these elements is partitioned to detritus according to φ det and the remainder is returned to the dissolved phase.
Parameter values used for other (nonzooplankton) aspects of the plankton food web dynamics are summarized in Table A4. Exudation is modeled as a constant 13% of NPP (Baines & Pace, 1991; Table A4). Viral losses are modeled as quadratic loss calibrated to consume~5% of SP production and 20% of bacterial production (Fuhrman, 2000;Suttle, 1994). Viral losses are partitioned between dissolved organic matter pools (Table A4). Aggregation is modeled as a quadratic relationship but, unlike COBALTv1, aggregation is suppressed when growth conditions are favorable (e.g., Waite et al., 1992). This is parameterized by first calculating the ratio of the phytoplankton growth rate over 24 hr to a reference rate defined as a fraction (f μ,agg =0.25) of the maximum growth rate under given conditions. For example, for large phytoplankton (LP): Aggregation is then modeled as a quadratic, density-dependent loss term (e.g., Doney et al., 1996) that is suppressed when μ ratio = 1 but increases rapidly when μ ratio < 1: All aggregated material is routed to detritus.
Semirefractory and semilabile dissolved organic matter decays into labile organic matter with characteristic multiannual and seasonal time-scales (Table A4, γ sldon , etc.). Labile organic matter is consumed by free-living bacteria and remineralized with dynamics analogous to Equations A15 and A16, but with  Hansen et al. (1997) calibrated as described in Stock and Dunne (2010). b Uniform temperature dependence of biological processes; see discussion in Stock et al. (2014b). c Straile (1997). d Hansen et al. (1994) and Fuchs and Franks (2010). e Nagata (2000), coarsely calibrated to match observed DON and DOP values in the Pacific and Atlantic (Abell et al., 2000;Libby & Wheeler, 1997;Vidal et al., 1999).
labile dissolved organic matter serving as a single "prey" item. Free living bacteria are consumed by small zooplankton and subject to density-dependent virus-driven losses with rate constant m vir (Table A4). Medium and large zooplankton are consumed by an implicit "higher predator" (i.e., fish) whose biomass is assumed to vary in proportion to available prey (e.g., Steele & Henderson, 1992). Ingestion rates are calculated in manner analogous to zooplankton Equation A15.
Nitrifying bacteria are modeled implicitly as described in Paulot et al. (2020). Briefly, COBALTv2 considers a general form of the nitrification relationship that allows for dependencies on temperature, light (Ward et al., 1982), speciation between ammonia and ammonium (Beman et al., 2011;Ward, 2008) the oxygen requirement for nitrification (i.e., Equation A3): Light is assumed to suppress nitrification following a saturating relationship k nitrif,irr = 10 W m −2 (Stock et al., 2014b, see Figure B2 for the typical depth of this irradiance threshold). A low oxygen half-saturation threshold k nitrif,o2 = 3.9 μmoles kg −1 suppresses nitrification in low oxygen conditions, and a third saturation constant controls k nitrif,nh3 the dependence of nitrification on ammonia substrates. Ammonia is diagnosed from NH 4 pool in COBALTv2 using the chemical disassociation relationships of Clegg and Whitfield (1995) and treating the ammonium pool in COBALT as reflective of the total reduced inorganic nitrogen pool (i.e., note the use of NH x in Equation A19). We note that this introduces a sensitivity of nitrification to acidification (Beman et al., 2011). Finally, since COBALTv2 does not explicitly resolve nitrifying bacteria, we allow for a potential scaling between NH x and nitrifying bacteria (Peng et al., 2016) by setting b = 2 in Equation A19. The value of nitrification rate scaling (γ nitrif ) was then coarsely calibrated to match  Baines and Pace (1991). b Suttle (1994) and Fuhrman (2000). c Stock et al. (2014b). d Same fractionation of fluxes to DOM as zooplankton, Table A3. e Jackson (1990). f Waite et al. (1992). g Abell et al. (2000). h Kirchman (2000), Cajal-Medrano and Maske (2005), del Giorgio and Cole (2000), and Ducklow (2000). i Extrapolated from Hansen et al. (1997) and calibrated as described in Stock and Dunne (2010). j Ward et al. (1982). k Coarsely calibrated within range of observed rates (Yool et al., 2007) to fit NH x observations .
cross-biome and seasonal NH x observations . The resulting nitrification patterns are provided in Figure B7. We recognize that past compilations of nitrification rates (Yool et al., 2007), based on very limited nitrification measurements and relevant covariates, are unable to discern aspects of Equation A19. An assessment of the functional form proposed in Equation A19 against recent measurements that greatly outnumber those compiled in previous studies is underway.

A.3 Detritus and the Biological Pump
Organic detritus is generated by aggregation and zooplankton fecal pellets as described in Appendices A.1 and A.2. This is augmented to a small degree by cell death, which occurs when metabolic costs exceed available energy in Equations A6 and A16. Silica and iron detritus are generated through the same processes. In the case of iron, detritus generated via aggregation and fecal pellets is augmented by the scavenging/adsorption of dissolved iron onto sinking detrital particles: where Fe′ is the free iron concentration (i.e., not bound to ligands) and ndet is organic detritus (see Table A5 for parameter values). This differs from COBALTv1, which assumed a first-order scavenging rate constant. The free iron concentration is determined with a single ligand model (Johnson et al., 1997). As in COBALTv1, the binding strength was modulated between weak high-light (K felig_hl ) and strong low-light (K felig_ll ) limits to mimic the weakening effect that oxygen free radicals have on iron binding in well-lit waters (Fan, 2008). The weakest binding takes place at light levels >10 W m −2 while the strongest occurs when light falls below 0.01 W m −2 (see Figure B2 for the depths corresponding to these thresholds). The ligand concentration is comprised of a background concentration (felig_bkg) and an amount proportional to nonrefractory dissolved organic matter (felig2don). When Fe′ exceeds iron solubility limits defined by as a function of temperature and salinity according to Liu and Millero (2002), scavenging is increased tenfold to mimic the effect or rapid precipitation.
The generation of calcite and aragonite detritus in COBALTv2 has been reformulated to be more closely linked to the organisms and processes creating calcite and aragonite detritus. For the production of calcite detritus, sources are assumed to be detritus generation associated with (a) the consumption of coccolithophorids by zooplankton, (b) the consumption of forams by zooplankton, and (c) the aggregation and sinking of coccolithophorids. To estimate the calcite generation, relevant total fluxes from COBALTv2, which include both calcium carbonate forming and noncalcium carbonate forming organisms, are scaled by a factor proportional to the calcite saturation state (Ω calc ): Parameter values and definitions are provided in Table A5. The saturation state factor in Equation A21 is furthermore limited to a maximum value caco3_sat_max = 10. A similar approach is used for aragonite production associated with pteropods, with the relevant fluxes being detrital production associated with zooplankton and higher predator consumption of medium and large (pteropod-sized) zooplankton. Lithogenic minerals are assumed to be incorporated into lithogenic detritus via filter feeding copepods as described in Stock et al. (2014b).

A.4 Sediment Dynamics
Upon reaching the sediments, a fraction of organic matter is buried following the parameterization of Dunne et al. (2007). Since this formulation was developed primarily for oceanic rather than nearshore conditions, we ramp-up the burial function with a half-saturation depth (z burial = 50 m) to avoid exceedingly high burial in nearshore regions. Other aspects of the treatment of organic matter upon reaching the sediments are as described in Stock et al. (2014b): (a) The relationship of Middelburg et al. (1996) is used for an initial calculation of the portion of the benthic flux that is remineralized via denitrification (Equation A5); (b) if enough oxygen is available, the remainder is assumed to be remineralized aerobically (Equation A2); (c) if O 2 < O 2,min , the remainder is assumed to be remineralized via additional denitrification as long as NO 3 is available; and (d) if both O 2 and NO 3 are depleted, sulfate reduction is assumed to occur. The impact of sulfate reduction is tracked as an oxygen deficit.
Iron reaching the sediment is assumed to be buried. The sediment iron source follows Dale et al. (2015) with additional geothermal inputs based on Tagliabue et al. (2010Tagliabue et al. ( , 2014. Calcite burial, storage, and dissolution dynamics are drawn from Dunne, Hales, et al. (2012).  (2008) and Stock et al. (2014b). c Coarse calibration to Völker and Tagliabue (2015). d Coarse calibration to Dunne et al. (2007). e Martin et al. (1987) and Laufkötter et al. (2017). f Wong et al. (1999) and Betzer et al. (1984). g Gnanadesikan (1999), adjusted for temp. dependence (Kamatani, 1982), calibrated to deep silica profiles in the Southern Ocean. h Laufkötter et al. (2017). i Dunne et al. (2005). (Figures B1-B9) Appendix B contains ancillary model evaluation figures referenced throughout the text. Figure B1 shows the mean simulated phytoplankton C:Chl ratios at the ocean surface. Figure B2 shows the mean depth of ecologically and chemically-relevant irradiance thresholds. Figure B3 provides a comparison of simulated and observed chlorophyll profiles at the Hawaii Ocean Time Series site. Figure B4 shows the fraction of phytoplankton in various plankton functional types. Figure B5 shows the satellite-estimated and simulated seasonal progression of chlorophyll. Figure B6 shows the depth-integrated bacterial and microzooplankton biomass. Figure B7 shows the rates of total denitrification and nitrification. Figure B8 summarizes the aragonite export and saturation states. Figure B9 shows the N:P ratio of particle export at 100 m.    (Ducklow, 1999;Steinberg et al., 2001), with the lower bound corresponding to low carbon conversion factors of~10 fg C cell −1 . Values of~750-1000 mg C m −2 are consistent with observed range of biomass from comprehensive studies conducted during the Joint Global Ocean Flux Study and other comprehensive studies (Ducklow, 1999(Ducklow, , 2000Garrison et al., 2000;Kirchman et al., 1993Kirchman et al., , 1995. Simulated microzooplankton biomass in subtropical gyres (300-500 mg C m −2 ) is consistent with the estimates of Roman et al. (1995) at the Bermuda Atlantic Time Series. Values~600-800 mg C m −2 are below estimates ranging from 900-1500 mg C m −2 in the equatorial Pacific and Arabian Sea (Garrison et al., 2000;Verity et al., 1996). In higher latitudes, microzooplankton biomass of~1,000 mg C m −2 has been observed in the North Pacific (Booth et al., 1993), consistent with simulated values of 800-1,000 mg C m −2

Appendix B: Additional Figures
. Values >2,000 mg C m −3 have been observed during high latitude bloom conditions in the North Atlantic and Southern Ocean (Daniels et al., 2006;Dennett et al., 2001). While these are higher than the annual values in Figure B8, peak simulated monthly values during bloom periods range between 1,000 and 3,000 mg C m −2 (not shown). Finally, we note that simulated microzooplankton concentrations~5-20 mg C m −3 are consistent with ranges reported by Buitenhuis, Vogt, et al. (2013). A tabular synthesis of the observations summarized above, with uncertainty estimates, can be found in Stock et al. (Stock et al., 2014b).   Figure B9. The nitrogen to phosphorus ratio (N:P) of particle export at 100 m and of the surface dissolved organic matter (the sum of labile, semilabile, and semirefractory dissolved organic nitrogen divided by the sum of labile, semilabile, and semirefractory phosphorus). Note the difference in scales. N:P values in detritus approaching 20:1 in the subtropical gyres reflect the dominance of small phytoplankton. Lower values at higher latitudes and productive areas reflect greater prominence of diatoms. Elevated N:P ratios in dissolved organic pools primarily reflect the longer-lived nature of semirefractory DON (e-folding time scale~10 yr) relative to semirefractory dissolved organic phosphorus (e-folding time scale~4 yr). Subtropical peaks of DON:DOP are consistent with observed patterns (e.g., Abell et al., 2000).