Improved Representation of Underwater Light Field and Its Impact on Ecosystem Dynamics: A Study in the North Sea

Understanding ecosystem state on the North ‐ West European (NWE) Shelf is of major importance for both economy and climate research. The purpose of this work is to advance our modeling of in ‐ water optics on the NWE Shelf, with important implications for how we model primary productivity, as well as for assimilation of water ‐ leaving radiances. We implement a stand ‐ alone bio ‐ optical module into the existing coupled physical ‐ biogeochemical model con ﬁ guration. The advantage of the bio ‐ optical module, when compared to the preexisting light scheme is that it resolves the underwater light spectrally and distinguishes between direct and diffuse downwelling streams. The changed underwater light compares better with both satellite and in situ observations. The module lowered the underwater photosynthetically active radiation, decreasing the simulated primary productivity, but overall, the improved underwater light had relatively limited impact on the phytoplankton seasonal dynamics. We showed that the model skill in representing phytoplankton seasonal cycle (e.g., phytoplankton bloom) can be substantially improved either by assimilation of satellite phytoplankton functional type (PFT) chlorophyll, or by assimilating a novel PFT absorption product. Assimilation of the two PFT products yields similar results, with an important difference in the PFT community structure. Both assimilative runs lead to lower plankton biomass and increase the nutrient concentrations. We discuss some future directions on how to improve our model skill in biogeochemistry without using assimilation, for example, by improving nutrient forcing, retuning the model parameters, and using the bio ‐ optical module to provide a two ‐ way physical ‐ biogeochemical coupling, improving the consistency between model physical and biogeochemical components.


Introduction
Ecosystems convert inorganic matter into organic compounds mostly through the process of photosynthesis. The central role of light in photosynthesis implies that any successful marine or terrestrial ecosystem model must be reasonably skilled in representing the basic properties of the incoming light field. Representation of light is of special importance to marine ecosystem models, since the ocean has a large impact on light properties, pathways, and extinction, mostly through backscattering and absorption by water, phytoplankton, sediment, detritus, and particulate organic matter (e.g., Gregg & Casey, 2009;Gregg & Rousseaux, 2016. What exactly happens with light as it penetrates through the ocean and how much of it is used to drive photosynthesis depends largely on its spectral and directional properties at the time when light enters the water column. These spectral and directional properties in turn depend on the atmospheric conditions, in particular on the scattering and absorption by clouds, aerosols, ozone, water vapor, and other atmospheric constituents (Gregg & Casey, 2009;Gregg & Rousseaux, 2016).
Although biological processes depend on the light spectral decomposition (Dickey et al., 2011), most ecosystem models represent the underwater irradiance either by a total short-wave radiation, or photosynthetically active radiation (PAR) Doney et al., 2006;Henson et al., 2010;Laufkötter et al., 2013;Maier-Reimer et al., 2005;Marinov et al., 2010;Palmer & Totterdell, 2001;Zielinski et al., 2002) (see also Gregg & Rousseaux, 2016, for an overview). Furthermore, the direction of the light beam remains often unresolved, which typically equals to the assumption that the incoming radiation is 100% diffuse (e.g., Butenschön et al., 2016). The underwater light field is in those ecosystem models reduced through a broadband attenuation term (e.g., Butenschön et al., 2016;Zhao et al., 2013), which sometimes distinguishes between the roles of water and phytoplankton in light attenuation (Jiang et al., 2003;Manizza et al., 2005;Xiu & Chai, 2014). However, in the last two decades, there have appeared a number of one-dimensional (1D Bissett et al., 1999Bissett et al., , 2005 and three-dimensional (3D; Baird et al., 2016;Dutkiewicz et al., 2015;Gregg, 2002;Gregg & Casey, 2007;Gregg & Rousseaux, 2016Gregg et al., 2003;Mobley et al., 2009) modeling studies where light has been spectrally resolved. For example, a recent study by Gregg and Rousseaux (2016) highlighted the importance of spectrally and directionally resolved light to simulate the global phytoplankton community structure, as well as the global chlorophyll and nutrient abundances. Gregg and Rousseaux (2016) have also shown that global primary productivity can be highly sensitive (on the levels of an order of magnitude) to the wavelength chosen to represent the broadband radiation in a nonspectra resolving light model.
The seas covering continental shelf have often nutrient-rich, biologically productive waters (de Haas et al., 2002), due to high levels of mixing in their shallow bathymetry and river input. North-West European (NWE) Shelf is a region of natural interest to European fisheries and economy (Pauly et al., 2002), and it is also a region of major importance for the carbon cycle and climate (e.g., Borges et al., 2006;Jahnke, 2010). The operational model run on the NWE Shelf consists of physical model Nucleus for European Modelling of Ocean (NEMO) coupled through the Framework for Aquatic Biogeochemical Model (FABM; Bruggeman & Bolding, 2014) to the biogeochemical model the European Regional Seas Ecosystem Model (ERSEM). ERSEM is a popular lower trophic level model to represent Shelf Sea biogeochemistry Baretta et al., 1995;Blackford & Gilbert, 2007;Butenschön et al., 2016;Holt et al., 2012;Polimene et al., 2012;Wakelin et al., 2012), which is the biogeochemical component of the operational model for the NWE Shelf used by the European Copernicus Marine Ecosystem Monitoring Service (CMEMS). ERSEM skill has been repeatedly validated against different types of data (Allen & Somerfield, 2009;De Mora et al., 2013Edwards et al., 2012). However, ERSEM suffers from the same limitations as most marine ecosystem models: in the established ERSEM configuration , the light is taken as purely diffuse radiation (approximation for the higher latitudes) and is spectrally unresolved. The ERSEM model is typically forced by an external atmospheric product for net downwelling short-wave radiation, while underwater light extinction is calculated from light attenuation by clear sea water, four different phytoplankton functional types (PFTs), and aggregate absorption by particulate organic matter (POM), colored dissolved organic matter (CDOM), and sediment forced by an external product . The oversimplified ERSEM light scheme poses limitation on how model represents primary productivity (e.g., concentrations of phytoplankton biomass, magnitude, and timing of phytoplankton bloom). Furthermore, the ERSEM light scheme is not reliable enough to be used in how we calculate the heating of the water column, and a separate scheme from the physical model needs to be used instead. These drawbacks in how ERSEM treats underwater light are particularly concerning on the NWE Shelf, given that the Shelf seems to be one of the global regions with higher sensitivity to the representation of in-water optics (Gregg & Rousseaux, 2016).
In this work, we aim to improve the ERSEM representation of underwater light by implementing a spectrally resolved stand-alone bio-optical module (developed in the context of FABM) into an established NEMO-FABM-ERSEM configuration on the NWE Shelf. In the context of ERSEM, similar efforts have been made in Ciavatta et al. (2014), but this study goes far beyond of what has been done in the early paper of Ciavatta et al. (2014). Unlike Ciavatta et al. (2014), (a) we distinguish between direct and diffuse downwelling streams; (b) we comprehensively resolve light in a broad range of wavelengths including ultraviolet and infrared, while Ciavatta et al. (2014) resolved only three bands, all in a visible range; (c) our model has broad application on the NWE Shelf and beyond, whereas the applicability of the model by Ciavatta et al. (2014) was constrained to the Western English Channel. The bio-optical module implemented in this study is based on Ocean-Atmosphere Spectral Irradiance Model (OASIM) (e.g., Gregg & Casey, 2009) and forced by atmospheric fields, such as ozone, cloud cover and water vapor. It provides ERSEM with spectrally resolved radiation at the water surface and also with improved directional representation of light by decomposing the downwelling radiation into diffuse and direct streams. As outlined before, the most obvious purpose of this development is to improve model skill to represent ecosystem dynamics on the NWE Shelf. This is not, however, the only purpose of introducing the bio-optical module into ERSEM, as the spectrally resolved products for underwater light have their own importance, such as for recreational and commercial diving activities and naval operations (Woodham, 2011). Furthermore, another crucial aspect of this work is data assimilation, which has been increasingly applied in ecosystem modeling (Gehlen et al., 2015).
Data assimilation is a set of tools and methods that enable us to systematically merge the model forecast with the observational data in order to optimally represent the state of a complex dynamical system. It has been developed mostly in the field of numerical weather forecasting (e.g., Kalnay, 2003), but data assimilation has found application in a range of other fields, such as operational oceanography (e.g., Cummings et al., 2009;Edwards et al., 2015). The most typically assimilated data in biogeochemistry are ocean color-derived (satellite) products for surface concentrations of total chlorophyll (Carmillet et al., 2001;Ciavatta et al., 2011Ciavatta et al., , 2016Fontana et al., 2010;Ford et al., 2012;Gregg, 2008;Hoteit et al., 2005;Ishizaka, 1990;Kalaroni et al., 2016;Natvik & Evensen, 2003;Nerger & Gregg, 2007, 2008Pradhan et al., 2019;Torres et al., 2006), and this has been recently extended to phytoplankton functional type (PFT) chlorophyll (Ciavatta et al., 2018Skákala et al., 2018). There is however only a handful of studies that assimilate direct optical radiances, among the few cases there is assimilation of phytoplankton light absorption (Shulman et al., 2013), diffuse light attenuation coefficient (Ciavatta et al., 2014), reflectance data , and absorption by colored dissolved organic carbon (CDOC) (Gregg & Rousseaux, 2017). The advantage of using radiances is that they are often directly measured by the satellite and their products have consequently lower errors (e.g., Groom et al., 2019). However, whether such products can be assimilated into the model depends on how directly we can relate those radiances to the model state variables. One of the important roles played by the bio-optical module introduced in this work is that it provides the necessary link between spectral radiances and biogeochemistry from the NEMO-FABM-ERSEM model, and therefore increases the capacity of our assimilative systems.
In this study, we (a) assessed the impact of the new bio-optical module on ERSEM skill in the "free run" (without assimilation), and (b) assimilated satellite PFT chlorophyll (Brewin et al., 2017) and PFT optical absorption  products into ERSEM using bio-optical module. The runs assimilating satellite products into ERSEM coupled to the bio-optical module will be compared with two ERSEM free runs (with and without the bio-optical module) and with the currently established assimilation of PFT chlorophyll using the preexisting ERSEM light scheme (Ciavatta et al., 2018;Skákala et al., 2018). The ERSEM skill in the different simulations will be assessed by looking at how the model represents the following: (a) phytoplankton concentrations, community structure, and phytoplankton seasonal cycle (e.g., timing and magnitude of the Spring bloom), (b) the underwater light field, and (c) the nutrient cycle. We expect both the bio-optical module and the assimilation to have large impacts on (a) and (b), with lesser, but still important impacts on (c). We also anticipate important changes in the model carbon cycle, but due to lack of data, these can only be evaluated indirectly. In analyzing the differences between assimilating PFT chlorophyll and PFT absorption, it is important to note that the products are not independent: PFT absorption is derived from PFT chlorophyll using the model of Brewin et al. (2019). However, the absorption model used in ERSEM (based on Gregg & Casey, 2009;and Gregg & Rousseaux, 2016) is independent from the satellite model of Brewin et al. (2019) and so the two may not be entirely consistent. The key role played by data assimilation is that it merges information from multiple sources, and by assimilating PFT absorption into the model, we aim to move closer toward the optimal representation of underwater light. However, if there are differences between the ERSEM and satellite absorption models, PFT absorption assimilation might not provide statistically optimal representation of PFT chlorophyll, which in turn should be provided by PFT chlorophyll assimilation. So even though the PFT absorption data used in this study were derived from PFT chlorophyll, assimilating PFT absorption might have some advantages over assimilating PFT chlorophyll, and vice versa.
conditions for physical variables at the Atlantic boundary were taken from a reanalysis of the GloSea5 Seasonal Forecasting System (MacLachlan et al., 2015); the Baltic boundary values were derived from a reanalysis produced by the Danish Meteorological Institute for the Copernicus Marine Environment Monitoring Service (CMEMS). The model (including biogeochemistry) was initialized from the CMEMS reanalysis produced at the Met Office (product CMEMS-NWS-QUID-004-011, https://marine.copernicus.eu/ services-portfolio/access-to-products/). The free simulations were performed for a 3-year period (2016)(2017)(2018) and the more computationally costly assimilative runs for a single year (2016).
The river discharge data set used by  and Skákala et al. (2018) has been updated to cover more recent years using data from Lenhart et al. (2010). Unlike  and Skákala et al. (2018), here the model was forced at the surface by atmospheric fields provided by the high (hourly) temporal and (31 km) spatial resolution realization (HRES) of Copernicus Climate Change Service (C3S, 2017) ERA-5 reanalysis (https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era5).

The Ecosystem Component: ERSEM
ERSEM (Baretta et al., 1995;Butenschön et al., 2016) is a lower trophic level model for marine biogeochemistry, pelagic plankton, and benthic fauna (Blackford, 1997). It distinguishes between five chemical components: carbon, chlorophyll, nitrogen, phosphorus, and silicon, using variable stoichiometry for the simulated plankton groups (Baretta-Bekker et al., 1997;Geider et al., 1997). The model splits phytoplankton into four functional types largely based on their size (Baretta et al., 1995): picophytoplankton, nanophytoplankton, diatoms, and dinoflagellates. Each PFT biomass is represented in terms of chlorophyll, carbon, nitrogen, and phosphorus, with diatoms also represented by silicon. ERSEM predators are composed of three zooplankton types (mesozooplankton, microzooplankton, and heterotrophic nanoflagellates), with organic material being decomposed by one functional type of heterotrophic bacteria . The ERSEM inorganic component consists of nutrients (nitrate, phosphate, silicate, ammonium, and carbon) and dissolved oxygen. The carbonate system is also included in the model . We used in this study a well-established ERSEM parametrization described in Butenschön et al. (2016). At the Atlantic boundary values for nitrate, phosphate and silicate were taken from World Ocean Atlas (Garcia et al., 2013) and dissolved inorganic carbon from the GLODAP gridded data set (Key et al., 2015;Lauvset et al., 2016), while plankton and detritus variables were set to constant values.
The preexisting light scheme in ERSEM is described in Butenschön et al. (2016): The light is taken as diffuse only, and it is forced by the hourly net downwelling short-wave radiation from the ERA-5 product used to force NEMO as where I surf is the surface downwelling SWR, q PAR is the fraction of PAR, and the exponential term describes the broadband light attenuation by the different components in the water. The light attenuation distinguished absorption and backscattering by pure water and the four PFTs (based on the model of Lee et al., 2005) as where θ zen is zenith angle and absorption (a) and backscattering (b) terms are defined as and backscattering by sea water. The ady term captures the absorption by POM, CDOM, and sediment which is forced by an external product based on (443-nm wavelength) SeaWIFS data  and derived from the bio-optical model of Smyth et al. (2006). However, the preexisting ERSEM light scheme does not attempt to provide any genuine representation of underwater light; it merely focuses on estimating the photosynthetic energy flux through the surface of phytoplankton cells.

Spectrally Resolved Bio-Optical Module
The bio-optical module implemented into NEMO-FABM-ERSEM covers both atmosphere and ocean and is externally forced by bulk meteorological properties. The functionality of the atmospheric module matches that of the widely used Ocean-Atmosphere Spectral Irradiance Model (OASIM; Gregg & Casey, 2009), but additionally allows for arbitrary spectral resolution and, through FABM, for integration in a large number of oceanographic models. At the ocean surface, the module distinguishes two downwelling radiation streams, diffuse and direct, which are both fully spectrally resolved. These streams are then tracked downwards through the water column as they are being absorbed and scattered by water and ecosystem constituents.
The atmospheric fields driving the module were obtained from the Copernicus Climate Change Service (C3S, 2017) ERA-5 reanalysis (https://www.ecmwf.int/). The ERA-5 data came with 3-hourly temporal and 1/4°spatial resolution and covered the following atmospheric constituents (total aggregates per vertical column): ozone, water vapor, cloud liquid water, cloud cover, and the mean sea-level air pressure. These fields were supplemented with data for surface wind speed, air humidity, and air temperature, all provided by the NEMO atmospheric (ERA-5) forcing. In addition to these fields, we provided the module with aerosol optical thickness at 550 nm from MODerate resolution Imaging Spectroradiometer (MODIS) satellite product with monthly resolution (https://modis.gsfc.nasa.gov/data/dataprod).
The underwater irradiance spectra were resolved with 33 wavelengths (between 250 and 3,700 nm), with each spectral band reduced in the water column through backscattering and absorption by water and PFTs, based on the model of Gregg and Rousseaux (2016): and where E d is the direct downwelling stream, E s is the diffuse downwelling stream, C d , C s are light attenuation coefficients, and F d describes forward scattering. In Equations 5 and 6, we neglected all the upwelling terms from Gregg and Rousseaux (2016). Although backscattering is included in light attenuation, the upwelling stream will be only included in the future version of the spectral module. However, backscattering to total scattering ratio was ≤0.007 for all the ERSEM variables (see Table 1), so the upwelling stream could be reasonably neglected.
The module calculated light attenuation by the water components largely following the model of Gregg and Rousseaux (2016). The used scheme is summarized in Table 1. Similarly to the preexisting ERSEM light scheme, the absorption by POM, CDOM, and sediment (nonliving matter, a nl (λ)) was forced by an external product extrapolated from wavelength specific (443 nm) data of Smyth and Artioli (2010) as with S = 0.014 nm −1 , as for Chromophoric Dissolved Organic Carbon (CDOC) in Gregg and Rousseaux (2016).

Data and Validation 2.3.1. Assimilated Data
We assimilated in this study two satellite products: a PFT chlorophyll (Brewin et al., 2017) and a PFT absorption product ( (2017) splits phytoplankton into a subgroup of smaller species (<20 μm, picophytoplankton and nanophytoplankton) and microphytoplankton (diatoms and dinoflagellates) as where C is the total chlorophyll concentration, C 1, 2 is the aggregate concentration of picophytoplankton and nanophytoplankton, C 3,4 is microphytoplankon, C m 1; 2 is the maximum value of C 1,2 approached in the asymptotic limit C → ∞, while the D 1,2 is the fraction C 1,2 /C in the limit of C→0. The C m 1; 2 and D 1, 2 parameters are dependent on the sea surface temperature (SST; see Brewin et al., 2017), which was obtained from the satellite data (OISST version from Reynolds et al., 2007). Using analogous model to Equation 8, one can split C 1, 2 further into nanophytoplankton and picophytoplankton concentrations. Furthermore, the microphytoplankton concentration can be split into diatoms (C 3 ) and dinoflagellates (C 4 ) using where α, β are two suitably tuned parameters (Brewin et al., 2017).
The PFT chlorophyll product was already assimilated in Ciavatta et al. (2018) and Skákala et al. (2018). It has a daily temporal and 4-km spatial resolution and comes with bias and uncertainty estimates (in log-space). Both biases and uncertainties were estimated using in situ and satellite data match-ups following the approach from Jackson et al. (2017) and fuzzy logic statistics (Moore et al., 2009). It has been demonstrated that the PFT chlorophyll biases and uncertainties depend mostly on the optical water type ( OWT Brewin et al., 2017) with higher OWTs describing the optically complex waters typically found in the coastal and Shelf regions, and in the most dynamical time of the year (i.e., during Spring). As expected, the model has larger uncertainties in the higher OWTs; furthermore, the satellite model tends to underestimate chlorophyll concentrations in the lower OWTs and overestimate chlorophyll in the higher OWTs (see Brewin et al., 2017). Similarly to Ciavatta et al. (2018) and Skákala et al. (2018), we unbiased the satellite data prior to assimilation and calculated the uncertainties of the unbiased data (following the method of Ciavatta et al., 2016).
The PFT absorption product has been derived from the unbiased PFT chlorophyll data using the model of Brewin et al. (2019) for the North Atlantic. Based on specific absorption coefficients a * i ðλÞ fitted from in situ measurements, the model of Brewin et al. (2019) derives PFT absorption (a i (λ)) for 12 characteristic wavelengths (λ) as where the i index labels the specific PFT. six out of these 12 wavelengths were selected for the satellite product: 412, 443, 490, 510, 555, and 665 nm. Since the PFT absorption is derived from PFT chlorophyll, it has Note. All spectra are taken from Gregg and Rousseaux (2016) (abbreviated as G&R), Figure 3, and the data files accompanying the source code are available online (https://gmao.gsfc.nasa.gov/reanalysis/MERRA-NOBM/software/). the same spatial and temporal resolution as PFT chlorophyll. Similarly to PFT chlorophyll, PFT absorption also contains information about uncertainties, which have two sources: the original uncertainty of the PFT chlorophyll and additional uncertainty associated with the specific absorption coefficients . PFT absorption therefore has larger uncertainty than PFT chlorophyll.

Validation Data for the Oceanic Variables
We used both shelf-wide and location-specific in situ measured data to assess the model skill. For the shelf-wide data, we used three data sets: (a) the Ecosystem Data Online Warehouse of the International Council for the Exploration of the Sea (ICES, https://www.ices.dk/marine-data/) that contains measurements of three nutrients of specific interest: nitrate, phosphate, silicate, and also data for total chlorophyll. The location-specific data were obtained for the long-term monitoring station L4 in the English Channel (https://www.westernchannelobservatory.org.uk). To support some of the conclusions, we looked at two types of data: weekly climatology derived from 1994 to 2015 time series for total phytoplankton carbon (Widdicombe et al., 2010) and the underwater PAR for Year 2016. The phytoplankton carbon time series was obtained from a measurement location at 10-m depth, while the PAR data were measured across the whole water column (0-50 m). At each day, the PAR data were collected at a specific time in the morning. Since PAR is highly variable throughout the day (diurnal cycle, cloudiness), the observations cannot be directly compared to the model daily average outputs. However, PAR attenuation in the water column is much more stable on the daily time scale, as it depends mainly on the biogeochemical state of the water, which typically evolves on supra-daily scales (with some notable exceptions, such as migration of dinoflagellates). Therefore, we compared model and observations as ratios between the PAR values at a range of depths and the PAR value at 2.4 m. PAR data at depth shallower than 2.4 m were excluded due to insufficient number of observations.

Skill Metrics
Validating a model with in situ data is rarely trivial due to spatio-temporal differences in model and data resolution (e.g., Schutgens et al., 2017). In general, it is expected that bias (difference in model and data mean values) is reasonably unaffected by the different resolutions, but root mean square difference can be substantially increased by the in situ small-scale variability. The impact of small-scale variability can be reduced by suitably binning the observations. This paper focuses on two metrics: bias and bias-corrected RMSD (BC RMSD), which is the same as RMSD from Equation 11, but with bias subtracted from the model. To correct the RMSD for small-scale in situ variability, we decided to bin the in situ data along two dimensions with the largest data variability: the temporal and the vertical dimension. We calculated (bias corrected) RMSD along both temporal (with monthly bins) and vertical axes, and the total (bias corrected) RMSD presented in the paper is a simple average between those. However, this approach was applied only to the ICES data set, as the NSBC monthly climatology has larger spatio-temporal resolution than the model. Comparing the model with the gridded NSBC and satellite data is more straightforward, as all it needs is regridding the finer resolution data set on the coarser resolution scale.

Validation of the Atmospheric Part of the Bio-Optical Module
The bio-optical module was initially validated in the 1D 2013-2016 simulation by the atmospheric data provided by the Western Channel Observatory at the L4 station in the English Channel (some selected results for the atmospheric L4 fields are shown in Table 2). For the whole NWE Shelf, the module is validated in 10.1029/2020JC016122

Journal of Geophysical Research: Oceans
Figures 1 and 2. These two figures compare the total downwelling short-wave radiation (SWR) above the ocean surface between the module output, the ERA-5 product, which is used to force the preexisting ERSEM light scheme, the Interim Climate Data Record (ICDR) SEVIRI sensor product v400/v410 (https://wui.cmsaf.eu/), and the EUropean organization for the exploitation of MeTeorological SATellites (EUMETSAT) incoming daily SWR product from Polar Orbiting Satellites (POS, version 1.9.5, https:// mpimet.mpg.de/cdi). It is shown (Figures 1 and 2) that the module is reasonably consistent in terms of temporal and spatial patterns with the ERA-5 data provided by the CMEMS reanalysis. The relative bias between the bio-optical module and the ERA-5 product (module minus ERA-5) is very small (−0.5 W/m 2 , <1%), with bias-corrected root mean square difference (BC RMSD, Equation 11) around 10% from the median ERA-5 SWR value.   SWR products are not entirely consistent: the EUMETSAT POS SWR matches nicely with both ERA-5 and the module, but all are on average about 10% larger than ICDR SEVIRI data. In both cases, the module slightly outperforms ERA-5 in bias, but slightly underperforms in BC RMSD. The latter is probably due to larger spatial variability in module SWR when compared to both the ERA5 and the two satellite products ( Figure 2). Table 2 further compares the module PAR with MODIS-Aqua Level 3 satellite PAR product (https://oceancolor.gsfc.nasa.gov/l3/) showing very similar skill score to the comparison with EUMETSAT POS in SWR (negligible bias of −0.7%). The last validation data set shown in Table 2 was ICDR SEVIRI product for direct downwelling SWR. Not surprisingly, module skill in representing direct downwelling SWR is consistent with module skill in representing SWR when validated with the same ICDR SEVIRI data. The module relative bias was larger for direct SWR than for total SWR, suggesting that the overestimated direct SWR is the main reason why module overestimated total SWR relative to ICDR SEVIRI data.

The Data Assimilation (DA) System
We used the data assimilation set-up already described in Skákala et al. (2018), which is based on NEMOVAR (Mogensen et al., 2009(Mogensen et al., , 2012Waters et al., 2015), a 3D-VAR variational DA system used for operational ocean DA at the UK Met Office. The 3D-VAR version applied in this study uses the First Guess at Appropriate Time (FGAT) approach and minimizes the cost function using the conjugate gradient method (Mogensen et al., 2012). DA of PFT chlorophyll into NEMO-FABM-ERSEM using NEMOVAR has been implemented at the Met Office for use in reanalysis, and in the future, it will be considered for operational forecasting (see Skákala et al., 2018). The scheme starts with univariate assimilation of four separate PFTs (diatoms, nanophytoplankton, picophytoplankton, dinoflagellates) surface chlorophyll concentrations. The analysis is performed in log-space, taking account of the typical log-normal distribution of chlorophyll concentrations in the ocean (Campbell, 1995). The surface PFT chlorophyll increments are propagated with constant values vertically within the mixed layer. After calculating the increments for force the preexisting ERSEM light scheme, and two satellite products, the EUMETSAT POS (c) and ICDR SEVIRI data (d).

Journal of Geophysical Research: Oceans
PFT chlorophyll, one may use a balancing module to update some additional ERSEM state variables. In the case of PFT chlorophyll assimilation, it is essential to preserve the background stoichiometric ratios between the PFT components (chlorophyll, carbon, nitrogen, phosphorus, silicon), as those ratios reflect on the physiological adaptation of the PFT cells to the environment. The balancing scheme therefore updates the PFT chemical components (other than chlorophyll) as where i labels the four PFTs, X i is a given chemical component of a PFT, C i is the PFT chlorophyll, and ΔC i ,ΔX i are increments of C i , X i , respectively.
The PFT absorption satellite data  were available for six wavelengths (412,443,490,510,555, and 665 nm), and those did not exactly match with the 33 wavelengths used by the bio-optical module.
In order to match the bio-optical module with the satellite product, the PFT absorption values corresponding to the six satellite wavelengths were linearly interpolated from the module outputs prior to assimilation. The univariate assimilation was applied to calculate increments for the 24 (four PFTs times six wavelengths) surface radiances. Since PFT absorption is only a diagnostic variable, PFT absorption increments will have no impact on the model unless they are translated into increments of some model state variables. The model state variables with straightforward relationship to PFT absorption are the PFT chemical components (chlorophyll, carbon, nitrogen, phosphorus, and silicon). It is natural to first update the PFT chlorophyll by transforming PFT absorption increments into PFT chlorophyll increments through the PFT per wavelength specific absorption coefficients (see Equation 10). In principle, different wavelengths could produce different PFT chlorophyll increments, so the unique PFT chlorophyll increment was obtained as an average through the PFT chlorophyll increments calculated from the six wavelengths. After the PFT chlorophyll increments were calculated from the absorption increments, the same balancing scheme as in PFT chlorophyll assimilation provided the increments for the remaining phytoplankton components.
For both PFT chlorophyll (Brewin et al., 2017) and PFT absorption , the per-pixel observation errors were provided with the satellite products. The background errors were estimated by (a) binning the data monthly and into four characteristic regions based on bathymetry: (1) region with <20-m depth, (2) 20-to 50-m depth, (3) 50-to 200-m depth, (4) >200-m depth, and (b) by assuming that background and observation errors can be treated inside each bin as independent. Using (a) and (b), we estimated the background errors by subtracting observation errors from the unbiased differences between model and observations within each bin. A similar scheme is used typically for model diagnostics (e.g., Andersson, 2003;Desroziers et al., 2005); however, due to limited computational resources, we had to estimate the background errors from the free run instead of reanalysis. Such simplified scheme has been found sufficient in this application. In fact, in a previous study (Skákala et al., 2018), where PFT chlorophyll was assimilated into NEMO-FABM-ERSEM with NEMOVAR, we found that the precise formulation of error covariances had relatively small impact on the reanalysis, when compared to some other NEMOVAR system features, especially the formulation of the balancing scheme. Overall, the background-to-observational error ratios varied between different months and the four bathymetric regions, but on average, the background errors were two to three times larger than the observational errors. This is consistent with what has been found in previous studies (e.g., Ford & Barciela, 2017).

Journal of Geophysical Research: Oceans
chlorophyll averaged in 3D space (i.e., across the NWE Shelf) and time (Year 2016), as well as PFT chlorophyll, total phytoplankton and total zooplankton carbon, and nutrients (nitrate and phosphate). Figure 4 shows that even though the bio-optical module moderately increases chlorophyll, it results in lower concentrations of phytoplankton and zooplankton (carbon) biomass, leading to an overall increase in nutrients. This is most likely due to lower primary productivity caused by reduced photosynthetic radiation in the water column, which was found to be 10-20% lower (depending on the season) in the bio-optical module than in the preexisting ERSEM light scheme. Since there is negligible bias in the incoming solar irradiance (Figure 1), we expect that the bio-optical module reduces underwater PAR due to increased light extinction inside the water column. Figure 5 suggests that the module reduces underwater PAR dominantly in two wavebands: the ∼400to 470-nm waveband, which is mostly absorbed by detritus and particulate matter, and the 570-to 700-nm waveband, which is mostly absorbed by the sea water (e.g., Gregg & Casey, 2009;Gregg & Rousseaux, 2016).
The PFT chlorophyll-to-carbon ratio is a good indicator of the environmental (nutrients, irradiance, temperature) impact on PFT cell physiology (e.g., De Mora et al., 2013), with darker environments producing larger chlorophyll values relative to carbon (Finenko et al., 2003). We indeed observed (not shown here) that the reduced PAR in the bio-optical module lead to an overall increase of the PFT chlorophyll-to-carbon ratios. Since the module did not substantially change the PFT community structure (Figure 4), the larger PFT chlorophyll-to-carbon ratios explain the increase in the total phytoplankton chlorophyll-to-carbon ratio from Figure 4.

10.1029/2020JC016122
Journal of Geophysical Research: Oceans Figure 6 shows that the assimilation of satellite PFT chlorophyll and satellite PFT absorption into the bio-optical module has a large impact on PFT community structure and substantially improves the seasonal patterns of the phytoplankton growth. In particular, (a) assimilation moderates the extremity of the model Spring bloom and (b) it moves the Spring bloom by around ∼1 month toward the start of the year. This is consistent not only with the assimilated satellite data but also with the NSBC climatology, as well as with The panels show total phytoplankton and total zooplankton carbon biomass, nutrients (nitrate and phosphate), total chlorophyll-to-carbon ratio, and the PFT community structure (PFT-to-total chlorophyll ratio). The PFT abbreviations are as follows: "Diat": diatoms, "Nano": nanophytoplankton, "Pico": picophytoplankton, and "Dino": dinoflagellates.

Figure 5.
The left-hand panel shows visibility (in meters, defined as 1/Kd) for 490-nm wavelength at the ocean surface. As previously, the panel shows 2016 time series of the spatially averaged value across the NWE Shelf. The panel compares (a) free run using preexisting light scheme ("old scheme"), (b) free run using bio-optical module ("new scheme"), (c) the run assimilating PFT satellite absorption and using bio-optical module ("sat abs assim"), (d) the run assimilating PFT chlorophyll and using bio-optical module ("sat chl assim"), and (e) the run assimilating PFT chlorophyll and using preexisting light scheme ("sat chl assim old scheme"). The runs are compared with the OC satellite data (as previously, the November-February satellite time series were removed due to data sparsity). The right-hand panel shows 2016 Shelf median ocean surface visibility for six outputted wavelengths. Since the visibility of the preexisting light scheme is taken as broadband, the most consistent way of comparing it to the spectrally resolved visibility of the bio-optical module is to represent it across the spectral band with a constant value.

10.1029/2020JC016122
Journal of Geophysical Research: Oceans the seasonality observed in the ICES data (not shown here, but for overall skill score, see Figure 7). However, the light module has only little impact on this improvement in model skill. For example, assimilating PFT chlorophyll using the preexisting light scheme (Skákala et al., 2018) carries similar skill (Figure 7) than the two assimilative runs shown in Figure 6. The changed phytoplankton dynamics in the assimilative runs produces similar annual mean chlorophyll than the free runs ( Figure 4). The difference in total chlorophyll-a is largest between the two assimilative runs (Figure 4), which is due to differences in the specific absorption coefficients used in the satellite algorithm and in the bio-optical module ( Figure 8). As shown in Figure 8, the largest differences in the specific absorption coefficients are for diatoms and picophytoplankton. These differences are responsible for the distinct PFT community structure between the two assimilative runs (Figures 4 and 6). In particular, the relatively lower absorption of picophytoplankton implied by the bio-optical module (Figure 8) produces for PFT absorption assimilation higher picophytoplankton concentrations than the one produced in the PFT chlorophyll assimilative run ( Figure 6). And vice versa, the relatively higher absorption by diatoms (Figure 8) in the bio-optical module increases diatoms concentrations in the PFT absorption assimilative run (Figure 6), when compared to the PFT chlorophyll assimilation. Since diatoms are silicate users, the changed diatom concentrations between the two assimilative runs have substantial impact on silicate concentrations ( Figure 6). Overall, nutrients have the largest concentrations in the assimilative runs, which is explained by the lower plankton biomass in the assimilative runs relative to the free runs (Figure 4).
There are only few observation data for phytoplankton other than chlorophyll, so it is hard to determine whether the changed plankton biomass improves or degrades the model skill. Some indication can be obtained from the data at the specific L4 location in the English Channel, where longer time series exist for total phytoplankton carbon. The L4 phytoplankton carbon time series were available for the period  Figure 9 suggests that the model that uses the preexisting light scheme overestimates primary productivity, with some indications of a small improvement in model skill carried by the bio-optical module (Figure 9). A much more substantial improvement in model skill is then carried by the assimilation of PFT absorption, or PFT chlorophyll using the bio-optical module (Figure 9), which both also outperform PFT chlorophyll assimilation using the preexisting light scheme (Figure 9). Although the results presented in Figure 9 are location specific, the trend in carbon concentrations between the different simulations observed in Figure 9 copies the trend from Figure 4. Figure 9 might therefore indicate that the bio-optical module has a positive impact on how the model represents the carbon cycle. Figure 10 compares the underwater PAR between the free run using the bio-optical module and the two corresponding assimilative runs. It is shown (Figure 10) that the PAR in the water column has been to some degree increased by PFT absorption assimilation (∼10%) and to a slightly lesser degree by PFT chlorophyll assimilation (∼6%). The difference in the light field between the two (PFT absorption vs. PFT chlorophyll) assimilative runs is caused by (a) the difference in PFT community structure with highly absorbing picophytoplankton being more abundant in PFT absorption assimilation, and low absorbing diatoms being more abundant in PFT chlorophyll assimilation (see Figures 4, 6, and 8); and (b) the larger total chlorophyll concentrations in the PFT chlorophyll assimilative run, when compared to the PFT absorption assimilation Figure 7. The skill score (bias vs. BC RMSD) in representing total chlorophyll and nutrients (nitrate, phosphate, silicate) for the different simulations: (a) free run using preexisting light scheme (red), (b) free run using bio-optical module (blue), (c) the run assimilating PFT satellite absorption and using bio-optical module (purple), (d) the run assimilating PFT chlorophyll and using bio-optical module (green), and (e) the run assimilating PFT chlorophyll and using preexisting light scheme (yellow). The different markers represent comparison with different data: ICES data set (star), OC satellite product (circle), and NSBC climatology (diamond).

Journal of Geophysical Research: Oceans
( Figures 4 and 6). The change in community structure also explains why there is more underwater light in the PFT chlorophyll assimilative run than in the free run ( Figure 10), since the smaller phytoplankton size classes (picophytoplankton and nanophytoplankton) that absorb more photosynthetic energy are more prevalent in the PFT chlorophyll assimilative run than in the free run using the bio-optical module (Figure 4).
It is essential to validate the changes to the underwater light field due to the bio-optical module and assimilation. Since it is hard to get shelf-wide data for PAR, we validated the module skill with the available data for the specific L4 location. Figure 11 compares the PAR attenuation in the water column between the two free runs (preexisting light scheme vs. bio-optical module), the two assimilative runs using the bio-optical module and the L4 observations. For the L4 location, we learned ( Figure 11) that (1) as suggested by shelf-wide results, the bio-optical module leads to substantially larger light attenuation than the preexisting light scheme, (2) larger differences in underwater PAR result from the bio-optical module than from the assimilation, (3) the bio-optical module is more skilled to represent the L4 data than the preexisting light scheme, (4) the module skill in representing PAR attenuation depends largely on the season, with the module doing a much better job in the Spring-Summer period than in Autumn-Winter, (5) overall, the light module seems to slightly overestimate PAR attenuation when compared to the observations (especially in the upper 0-30 m), (6) module Kd varies more with depth than the observed Kd. While the results presented in Figure 11 are encouraging and consistent with shelf-wide analysis, it is important to keep in mind the limits of extrapolating general considerations from a location-specific analysis. Furthermore, there were no available L4 data for 2016 phytoplankton, CDOM, POM, or sediment, so it is difficult to explain the difference between the PAR field in the module and the observations.

Journal of Geophysical Research: Oceans
For the shelf-wide model skill, we can compare the model and the OC satellite surface visibility (defined as 1/Kd) for the 490-nm wavelength (the left-hand panel of Figure 5). The preexisting light scheme does not spectrally resolve the underwater light (the visibility in the preexisting scheme is represented as broadband), and it cannot be directly compared with the 490-nm visibility of the bio-optical module, or the satellite data. However, since broadband light attenuation assumes that the broadband value is sufficiently representative of all the wavelengths from the spectral band, we included the preexisting scheme into Figure 5 by representing its broadband visibility with a spectrally constant value. It is then shown that for the 490-nm wavelength: (a) All simulations tend to underestimate surface visibility and therefore underwater light near the ocean surface, (b) the bio-optical module substantially improved the match-ups between free run and satellite visibility at 490 nm, by increasing visibility, (c) PFT chlorophyll assimilation using the preexisting light scheme has little overall capability to improve the near-surface light field, (d) both PFT chlorophyll and PFT absorption assimilative runs using the bio-optical module outperform all the other runs in their match ups with the satellite data. The improved seasonal time series of surface visibility in the two assimilative runs ( Figure 5) is related to the improved phytoplankton seasonal dynamics from Figure 6. However, the relationship between surface visibility and phytoplankton concentration is not straightforward, as light attenuation includes impact of multiple other constituents (POM, CDOM, sediment).
Additional insight into the model and the satellite surface visibility at 490 nm is provided by two spatial figures, Figures 12 and 13. Similarly to Figure 5, Figure 12 shows that the preexisting light module underestimates satellite surface visibility on the NWE Shelf, with substantial improvement carried by the bio-optical module and the two assimilative runs using the bio-optical module. In Figure 13, we show the corresponding total chlorophyll-a surface concentrations, which are (on the NWE Shelf) clearly anticorrelated (Pearson correlation, R=−0.71) with the surface visibility. The relationship between phytoplankton and surface visibility is more visible in the runs using bio-optical module (Figures 12, 13b, and 13c) as the 490-nm visibility Figure 9. This figure compares total phytoplankton carbon at the L4 location at 10-m depth. Since the L4 observations were available only for 1992-2015 period, this figure shows 1992-2015 L4 weekly climatology ("obs clim"). The dashed lines show the interval corresponding to the interannual variability of the L4 data (for each week, the dashed lines represent ± standard deviation around the mean). The L4 phytoplankton carbon data showed no trend, so the observational climatology can be reasonably compared to the model 2016 time series. To avoid the figure being too crowded, we split the time series into two panels, with the upper panel comparing the two free runs: the run using preexisting light module ("old scheme") and the run using bio-optical module ("new scheme"), with the L4 data. The bottom panel compares the assimilative runs: PFT absorption assimilation using the bio-optical module ("sat abs assim"), the PFT chlorophyll assimilation using the bio-optical module ("sat chl assim"), and the PFT chlorophyll assimilation using the preexisting light scheme ("sat chl assim old scheme"), with the L4 data.

10.1029/2020JC016122
Journal of Geophysical Research: Oceans resolved by the bio-optical module is more sensitive to phytoplankton than the broadband visibility of the preexisting light scheme (Figures 12 and 13a). The spatial analysis from Figure 13 also supports the conclusions derived from the time series presented in Figures 6 and 7: on the NWE Shelf, the total chlorophyll-a surface distributions of the assimilative runs match closely with the satellite data.

General Discussion and the Future Directions
While observations suggest that by introducing the spectrally resolved bio-optical module into ERSEM, we improved representation of underwater light field (Figures 5,11,and 12), it is unclear whether the improved light also improved ERSEM ecosystem dynamics. There is some indication of improvement in model primary productivity (Figure 9); however, the overall model skill assessment from Figure 7 shows no significant difference between the performance of the preexisting light scheme and the bio-optical module. It is particularly disappointing that the spectrally resolved module failed to correct the model phytoplankton seasonal cycle, in particular the rapid and late model Spring bloom (as consistently indicated by both satellite and in situ data, e.g., Figure 3). The phytoplankton seasonality is of major importance for ecosystem dynamics as it provides grounds for any higher trophic level processes. Phytoplankon bloom timing and magnitude depend on three key drivers: (a) nutrient availability, (b) vertical mixing, and the (c) light availability. More specifically, the onset of Spring bloom is thought to be primarily dependent on the relationship between the depth penetrated by the solar radiation and some effective depth of phytoplankton mixing, which could be the mixed layer depth (as assumed by the critical depth hypothesis of Sverdrup, 1953), or some effective depth of turbulent mixing (see the critical turbulence hypothesis, Huisman et al., 1999;Waniek, 2003). Clearly, a model late Spring bloom can be indicative of incorrect model vertical mixing. The sensitivity of primary productivity to the upper ocean mixing scheme is well known (Doney et al., 2004;Oschlies & Garçon, 1999) and is often responsible for the deterioration in ecosystem model skill when physical data are assimilated into models (Berline et al., 2007;Samuelsen et al., 2009;Raghukumar et al., 2015). Errors in vertical mixing could potentially delay model Spring blooms, typically when too much mixing reduces Figure 10. Comparison of PAR (400-700 nm) in the water column. The panels show 2016 time series (x-axis) of median values for the NWE Shelf at each depth (y-axis, 0-to 100-m range). The upper panel shows PAR for the free run using the bio-optical module, the middle panel shows the difference between the run assimilating PFT absorption into bio-optical module and the free run using the bio-optical module, and the bottom panel shows the same as the middle panel, but for PFT chlorophyll assimilation. Although the daily differences are large due to changed phytoplankton seasonal cycle, the year-averaged differences are smaller and are shown (in %) within the yellow boxes that appear on the panels. phytoplankton concentrations by transporting its biomass to deeper and darker parts in the water column (Huisman et al., 1999;Smyth et al., 2014;Taylor & Ferrari, 2011). A promising route to address inconsistencies between the underlying physics (e.g., vertical transport) and biogeochemistry is to use the underwater light field calculated from the bio-optical module to drive temperature in the water column and provide very important feedback from biogeochemistry to physics (Edwards et al., 2004;Lengaigne et al., 2007;Sathyendranath et al., 1991;Simonot et al., 1988;Zhai et al., 2011). Such feedback provides us with a two-way coupled physics-biogeochemical model, which we will introduce in the future within the NEMO-FABM-ERSEM context.
Another possibility for the late model bloom is that ERSEM underestimates the underwater PAR and the environment is too dark for an earlier bloom to kick in. Figure 5 indeed suggest that despite of the vanishing module bias in PAR when compared to EUMETSAT POS data, the module overestimates the light attenuation near the water surface. It seems (Figure 12) that the module substantially overestimates light attenuation by some of the represented substances, and this overestimate is large enough to compensate for the POM and sediment backscattering, which was not included into light attenuation within this study. Some further model development might be therefore required in order to include the impact of sediment backscattering, while simultaneously increasing the underwater light field. One important clue might be the mismatch in diatoms and picophytoplankton specific absorption coefficients between the module and the satellite (Figure 8), which points to the large uncertainty in phytoplankton absorption that will need to be addressed in the future. A separate question is the ERSEM PFT community structure that substantially impacts the absorption of light by phytoplankton. The ERSEM PFT community structure is also sensitive to the model parametrization, in particular to the relatively poorly constrained maximum chlorophyll-tocarbon ratio parameters (Ciavatta et al., 2014). However, after all, increasing underwater PAR might not have substantial impact on the late phytoplankton bloom, as the preexisting ERSEM light scheme had Figure 11. This figure shows for the L4 location how PAR reduces in the water column relative to its value at the 2.4-m depth (observation data above 2.4 m were too sparse to be used). This figure compares observed data with four model runs: free run using preexisting light scheme ("old scheme"), free run using bio-optical module ("new scheme"), PFT absorption assimilative run using bio-optical module ("sat abs assim"), and PFT chlorophyll assimilative run using bio-optical module ("sat chl assim"). Each panel shows an average for a different season. The x-axis is on a log-scale, which means the slope of the curve is related to Kd.

10.1029/2020JC016122
Journal of Geophysical Research: Oceans 10-20% higher underwater PAR than the module and the phytoplankton seasonal cycle remained broadly unchanged.
The model phytoplankton bloom has also much larger magnitude than what has been observed both in the satellite and in situ data (e.g., Figure 3). This may be explained by the fact that ERSEM substantially overestimates the nitrate concentrations on the NWE Shelf (e.g., Ciavatta et al., 2018;Skákala et al., 2018; see also Figure 7), providing too much nutrients for the phytoplankton growth. The issue of ERSEM nitrate is unrelated to how the model represents chlorophyll, and improving the ERSEM chlorophyll by assimilation is known to further degrade the nitrate model bias (Ciavatta et al., 2018;Skákala et al., 2018). We anticipate that to improve the ERSEM nitrate concentrations, one needs to either focus on the model forcing (river discharge, nitrogen atmospheric deposition), or on some relevant model parameters, such as nitrification rate.
There is also a more general possibility that the current ERSEM version neglects important processes with substantial impact on phytoplankton seasonality. One such process could be dinoflagellate motility; another process that has been already shown to have positive impact on the simulation of phytoplankton succession is the explicit representation of different xanthophyll photoprotective activities in phytoplankton groups . Finally, it is quite possible that to improve ecosystem dynamics under the new module, one needs a more complex ERSEM reparametrization. This would not be entirely surprising as the current set of ERSEM parameter values has been chosen to optimize the skill of the model using the preexisting light scheme and such parametrization might easily become suboptimal once one introduces large changes to the model. Since model development and parametrization is an arduous task, it is encouraging that it can be partly bypassed by data assimilation, with assimilative runs substantially outperforming free runs in chlorophyll (Figures 6, 7, and 13), underwater light field ( Figure 5) and possibly also primary productivity Figure 12. Annual 2016 median distributions of surface visibility (m) for the 490-nm wavelength, defined as the inverse of Kd at the satellite data locations. The different panels compare (a) free run using the preexisting light scheme ("old scheme"), (b) the free run using the bio-optical module ("new scheme"), (c) the run assimilating PFT absorption ("sat abs asim"), and (d) the OC satellite data. The run assimilating the PFT chlorophyll-a into the bio-optical module is not shown, as the distributions are virtually identical to the absorption assimilation (bottom left panel). The full black line marks the boundary of the continental shelf (<200-m bathymtetry).

10.1029/2020JC016122
Journal of Geophysical Research: Oceans ( Figure 9). In fact, improvement in our understanding of ecosystem dynamics (e.g., carbon cycle, nutrient cycle, and trophic export) should be ideally understood as a hand-in-hand effort between the model development and the new types of observational products advancing our assimilation capability. Here, the bio-optical module plays an important role in multiple aspects of this process: It improves the model, provides us with a better capacity to assimilate new data into the model, and potentially, in the future, it could help to develop, or validate, observational algorithms used to derive biogeochemical fields of interest.

Summary
In this work, we introduced a novel bio-optical module into an ecosystem model that is used for operational reanalysis and forecasts on the NWE Shelf. The two main advantages of the bio-optical module are that it (a) spectrally resolves the light in the water-column and (b) better accounts for the direction of the light beam. The new module improves the simulation of the underwater light field by providing its spectral decomposition and improving the light attenuation in the water column. The improved representation of the underwater light changes the simulated primary productivity, and there is some evidence that the changed primary productivity improves phytoplankton carbon biomass. Much greater improvement in model skill is achieved through assimilating either satellite PFT chlorophyll, or PFT absorption, with both assimilative runs having major positive impact on the model skill to represent chlorophyll seasonal cycle (i.e., the timing and magnitude of Spring bloom), underwater light attenuation and possibly also phytoplankton carbon biomass. The model skill to represent biogeochemical variables is improved dominantly by assimilation, while the model skill to represent underwater light field is improved primarily by the bio-optical module. The importance of the bio-optical module is particularly evident with respect to the currently established assimilation of PFT chlorophyll using the preexisting ERSEM light scheme, as we have shown that this fails to correct the underwater light field of the free run. We The different panels compare (a) free run using the preexisting light scheme ("old scheme"), (b) the free run using the bio-optical module ("new scheme"), (c) the run assimilating PFT absorption ("sat abs asim"), and (d) the OC satellite data. The run assimilating the PFT chlorophyll-a into the bio-optical module is not shown, as the distributions are virtually identical to the absorption assimilation (bottom left panel).
suggest that model simulation of phytoplankton seasonal cycle could be further improved by retuning ERSEM parametrization (e.g., addressing the values of phytoplankton specific absorption coefficients), improving nutrient forcing (e.g., river discharge), and improving the underlying physics (e.g., vertical mixing). The latter could be potentially addressed by using the bio-optical module to correct temperature profiles in a fully two-way coupled physical-biogeochemical model.

Data Availability Statement
The ERA-5 atmospheric data used to force the bio-optical module can be freely downloaded from https:// www.ecmwf.int/, the MODIS data for aerosol optical thickness from https://modis.gsfc.nasa.gov/-data/dataprod, the NSBC climatological data-set used for model validation can be downladed from https://icdc.cen. uni-hamburg.de/1/daten/ocean/knsc-hydrographic0/, and the ICES data from their web site (https:www. ices.dk/marine-data/). The L4 validation data can be obtained from the Western Channel Observatory (https://www.westernchannelobservatory.org.uk/). The outputs for the NEMO-FABM-ERSEM simulations are stored on the MONSooN storage facility MASS and can be obtained upon request.