Surface and Atmospheric Driven Variability of the Single‐Layer Urban Canopy Model Under Clear‐Sky Conditions Over London

Urban canopy models (UCMs) are parametrization schemes that are used to improve weather forecasts in urban areas. The performance of UCMs depends on understanding potential uncertainty sources that can generally originate from the (a) urban surface parameters, (b) atmospheric forcing, and (c) physical description. Here, we investigate the relative importance of surface and atmospheric driven model sensitivities of the single‐layer urban canopy model when fully interactive with a 1‐D configuration of the Weather Research and Forecasting model (WRF). The impact of different physical descriptions in UCMs and other key parameterization schemes of WRF is considered. As a case study, we use a 54‐hr period with clear‐sky conditions over London. Our analysis is focused on the surface radiation and energy flux partitioning and the intensity of turbulent mixing. The impact of changes in atmospheric forcing and surface parameter values on model performance appears to be comparable in magnitude. The advection of potential temperature, aerosol optical depth, exchange coefficient and roughness length for heat, surface albedo, and the anthropogenic heat flux are the most influential. Some atmospheric forcing variations have similar impact on the key physical processes as changes in surface parameters. Hence, error compensation may occur if one optimizes model performance using a single variable or combinations that have potential for carryover effects (e.g., temperature). Process diagrams help differences to be understood in the physical description of different UCMs, boundary layer, and radiation schemes and between the model and the observations.


Introduction
Urban canopy models (UCM) are essential components of many numerical weather prediction (NWP) models as they represent subgrid scale physical process of the urban fabric. However, given their complexity, UCMs' performance is not always well understood. Large variations in model performance have been reported between different UCMs  with similar configurations for the urban surface parameters (Best & Grimmond, 2015;Grimmond et al., 2010) and between the same UCM with different configurations Loridan et al., 2010).
One source of uncertainty in the performance of a UCM originates from the complexity of the representation of the urban surface in the UCM (Best & Grimmond, 2015). The complexity ranges from bulk schemes that only account for basic surface parameters to multilayer schemes accounting for building drag effects and street-canyon orientation (Jarvi et al., 2011;Kusaka et al., 2001;Martilli et al., 2002;Masson, 2000;Salamanca & Martilli, 2009). The simpler physical description in less complex UCMs could potentially lead to worse performance from incomplete representation of physical mechanisms in the urban environment. Best and Grimmond (2015) showed that differences in complexity between UCM might not be the primary source of model bias. Their findings, supported also by those of Loridan et al. (2010), Demuzere et al. (2017), and Ronda et al. (2017), suggest that adequate prescription of urban surface parameters is equally essential (if not more) for reducing model biases. The surface parameters are also linked to key physical processes, like radiation absorption (e.g., surface albedo and emissivity) and surface energy partitioning. Given the plethora of parameters that are needed to be known, disentangling the contribution of each parameter on model performance can be time consuming (Loridan et al., 2010;Loridan & Grimmond, 2012;Zhao et al., 2014). An offline multioptimization that minimizes errors due to uncertainties in surface parameters in a UCM is possible (Loridan et al., 2010). However, understanding the contribution of all surface parameter changes to key physical process in the urban fabric, and the coupling with the overlying atmosphere is virtually an impossible task.
Complexity increases further when UCMs are coupled to the atmosphere. In such cases an offline multioptimization approach might not have the expected improvement in model performance, as model response to changes in parameters setting varies substantially depending on whether a UCM is coupled to the atmosphere or not (Tsiringakis et al., 2019). However, when coupled to a NWP model, UCM uncertainty increases also from variations in atmospheric forcing provided by the different parameterization schemes coupled with the UCM. Ferrero et al. (2018) found that BEP and BEP + BEM models perform better when coupled to TKE-based boundary layer schemes. This highlights that the turbulent mixing intensity strongly affects model performance, through modification of near-surface atmospheric forcing. Such findings are also supported by Sterk et al. (2013) and Bosveld et al. (2014). Therefore, it is not surprising that in-depth knowledge on UCM performance when coupled to NWPs is still limited. It is essential to know which uncertainty sources could have the largest impact on the key radiative, surface energy, and turbulent mixing processes. We investigate if different uncertainty sources have similar impacts on model variability to identify compensating effects on model bias. Changes in surface parameters and atmospheric forcing on three key physical process are tested. The results are utilized to identify why model performance varies between different UCMs coupled to the same NWP and between the same UCM coupled to different boundary layer and radiation schemes existing in the NWP model.
The impact of changes in surface parameters and atmospheric forcing on the surface radiation balance, surface flux partitioning, and turbulent mixing in the single-layer urban canopy model (SLUCM, Kusaka et al., 2001) coupled to the Weather Research and Forecasting (WRF) model (Powers et al., 2017;Skamarock et al., 2008) is studied using a case study (section 2). A brief evaluation with observations is conducted prior to an in-depth analysis of the model variability caused by changes in surface parameters and atmospheric forcing (section 3). We explore how physical processes modified by changes in the atmospheric forcing and surface parameters can explain variations between different parameterization schemes (section 4). Finally discussion and conclusions are drawn (sections 5and 6).

Case Study Description
A clear-sky 54-hr period (00:00 UTC 23 July 2012 to 06:00 UTC 25 July 2012) from the SUBLIME study Tsiringakis et al., 2019) is used. A spin-up time of 6 hr is used to allow building temperatures adjust to the changes in atmospheric forcing components and surface parameters. The first 24-hr (06:00 UTC 23 July 2012 to 05:00 UTC 24 July 2012) are used for the primary focus of this paper (sections 3.2, 3.3, 3.4, and 4). For the following 24-hr (06:00 UTC 24 July 2012 to 05:00 UTC 25 July 2012) the sensitivity analyses are repeated. This latter material and comparison to the prior day is presented in the supporting information.
Observations taken at the King's College London measurement site (KSSW) (Kotthaus & Grimmond, 2014a, 2014b include temperature, wind, radiation, and surface fluxes at 50 m above ground level (a.g.l.). The local urban surface properties used for the reference experiment (Table 1) are based on existing literature for the KSSW site (Kotthaus & Grimmond, 2014a, 2014b and central London (Bohnenstengel et al., 2014b;Oikonomou et al., 2012). Other surface parameters used (not in Table 1) are (a) average building height, (b) aspect ratio, (c) roof width, (d) albedo, (e) heat capacity, (f) thermal conductivity, and (g) emissivity values for all facets. These are from SUBLIME Tsiringakis et al., 2019) and are available on the SUBLIME website (www.met.wur.nl/sublime).
The atmospheric forcing Tsiringakis et al., 2019) includes initial profiles for potential temperature, wind, mixing ratio (up to 17 km) and surface pressure. The 1-D WRF model initialization profiles are based on a radio-sounding (UWYO, 2012) at Herstmonceux, Hailsham, UK, at 00 UTC 23 July 2012 and are adjusted using the KSSW measurements and 3-D WRF model derived profiles over London. The adjustment with KSSW and 3-D WRF is done to ensure an accurate estimate of boundary temperature, moisture, and wind given the 70 km separation between KSSW and Herstmonceux.
Boundary conditions for the 1-D WRF model are applied in the form of subsidence, geostrophic wind, and advection tendency terms for potential temperature, moisture, and u and v wind components. Geostrophic wind is derived from 6-hourly ECMWF operational reanalysis data (ECMWF, 2012) in combination with a 3-D WRF simulations (hourly data) Tsiringakis et al., 2019). Geostrophic wind values are given as 6-hourly means, with a tendency term applied to the u and v geostrophic wind components at each time step (of 1-D WRF) to ensure a smooth change in geostrophic wind through the 6-hourly blocks and avoid oscillations from imbalance between actual and geostrophic wind speeds.
Advection for potential temperature, mixing ratio and momentum (u and v wind) are imposed throughout the 54-hr period as additional tendency terms in the prognostic equations for the specified variables. These tendency terms are derived from 3-D WRF model simulations for London (Tsiringakis et al., 2019). The hourly advection terms from 3-D WRF are averaged with advection estimates from WMO stations (NOAA/NCDC, 2012) within and around London. Six-hourly means are obtained from the hourly values. Thus a static advection is prescribed (i.e., independent of the 1-D WRF temperatures) that changes every 6 hr in the 1-D WRF simulation. Advection is treated as uniform throughout the observed height of the boundary layer, with a sharp linear decrease above. This new approach is preferred to avoid daytime stable stratification from a sharp decrease of negative temperature advection between 500 and 1,500 m, as in the original SUBLIME forcing Tsiringakis et al., 2019). The latter affects the distribution of TKE in the upper part of the boundary layer, thus impacting the boundary layer height and the temperature and moisture profiles.
Initial soil temperature and moisture content profiles (to 1.5 m depth), surface temperature for vegetation and urban surfaces are provided. They are derived from a 3-D WRF simulation and then cycled 3×2 days in an offline setup, until the deepest soil temperature became constant and storage heat flux shows a similar diurnal range for both days of the case study (Tsiringakis et al., 2019).

Model Description and Setup
Here we use the single-column version of WRF v3.8.1 (Skamarock et al., 2008). The urban surface is parameterized based on the SLUCM scheme (Chen et al., 2011;Kusaka et al., 2001) with the Noah-LSM scheme (Chen & Dudhia, 2001) representing the vegetated land-surface processes. SLUCM separates the urban surface into three facets (roof, road, and wall), each with their distinct sky-view factor based on urban morphological parameters. The turbulent sensible heat (Q H ) flux from each facet is calculated: (1) where ρ is the air density (kg m −3 ), c p is the specific heat capacity of dry air (J kg −1 K −1 ), and C H is the exchange coefficients for heat. Δθ (K) is the potential temperature difference between the surface and the air. U a is the wind speed (m s −1 ) at the first model level. The urban and vegetation fluxes are combined using a tiling approach based on their plan area fraction. The anthropogenic heat flux (Q f ) (added to the first model level) is prescribed with a diurnal cycle. For more details about SLUCM physical description and parameters see Loridan et al. (2010).
The surface layer and boundary layer are parameterized using Mellor-Yamada-Janjic (MYJ) schemes (Janjic, 1994). Radiative processes for long-wave and short-wave radiation are obtained from the RRTMG radiation  Note. Experiment naming uses the surface parameters tested and their value. See Table A1 for notation definitions.
schemes (Iacono et al., 2008), while for microphysics the WSM 3-class order scheme (Hong et al., 2004) is used. The model uses 70 vertical levels that extent to 17 km, with 25 levels within the lowest 1.5 km.

Strategy for the Sensitivity Tests
Sensitivity tests are conducted to address the three sources of uncertainty influencing model performance: (a) surface parameters, (b) atmospheric forcing, and (c) the differences in physical description in other essential parameterization schemes. The aim is to identify the effects these have on the surface radiation, energy fluxes, and turbulent intensity.

Surface Parameters
Based on Loridan et al. (2010) and Zhao et al. (2014), we identify urban fraction (f urban ), albedo (a), thermal conductivity (λ), heat capacity (C), the a kanda parameter, and the anthropogenic heat flux (Q f ) as the most influential surface parameters. Note that as Q f is prescribed in SLUCM it is treated as a parameter rather than a variable. For simplicity we investigate only the impact of roof albedo (a roof ), wall thermal conductivity (λ wall ), and heat capacity (C wall ) rather than all facets. Tsiringakis et al. (2019) found these to have the dominant impact on model performance in the current model configuration.
The a kanda parameter modifies the roughness length of heat (Zo hc ) from the one of momentum (Zo mc ) above the urban canyon and the overlying atmosphere (Kanda et al., 2007;Loridan et al., 2010) via, where Re * c is the Reynolds roughness number. Hence the a kanda primarily influences the ratio between sensible (Q H ) and storage (ΔQ s ) heat fluxes, with larger values decreasing Q H (and vice versa). It thus impacts the skin temperatures, surface flux partitioning, turbulent mixing in the surface layer, and the outgoing long-wave radiation (LW U ) due to lower ΔQ s . Q f is a very uncertain parameter as in reality it is highly variable with strong dependence on anthropogenic activities (Bohnenstengel et al., 2014;Dong et al., 2017;Iamarino et al., 2012). For London values can vary between 10 and 140 W m −2 , depending on the location and area extent, with estimates reaching 200 W m −2 in central London (Iamarino et al., 2012) but small when averaged over larger areas (Dong et al., 2017;Lindberg et al., 2013). Ward et al. (2016) estimate Q f for this study area to vary temporally between 20 and 80 W m −2 . Considering how important the anthropogenic heat flux can be for the surface energy balance (Best & Grimmond, 2016) it is essential to include it in our analysis. Reference values, incremental change, and minimum/maximum variation limits for the surface parameters (Table 1)

Atmospheric Forcing
The atmospheric effects investigated are the impact of radiation, advection of heat and moisture, and turbulent mixing intensity. Short-wave radiation biases in cities are common even in clear-sky conditions (Tsiringakis et al., 2019) by WRF, especially due to aerosol loading effects on direct short-wave radiation (Gomes et al., 2008;Kokkonen et al., 2019). Hence we modulate the aerosol optical depth (AOD) to simulate effects on radiation between clearer and more polluted atmospheric conditions. Terra/MODIS satellite derived AOD (Levy et al., 2013) range between 0.1 and 0.4 (0.18 in the reference run) depending on the distance from central London and the timing of the measurement (NASA/EOSDIS, 2019). Long-wave radiation biases in models participated in GABLS (including WRF) (Bosveld et al., 2014;Kleczek et al., 2014) suggest that long-wave downward radiation (LW D ) biases are caused by bias in boundary layer temperature and humidity and different physical complexity of the radiation models. Using an extreme CO 2 uncertainty range (38-3,800 ppm) we increase and decrease the long-wave downward (LW D ) radiation without changing directly air temperature or humidity. This allows us to further disentangle the atmospheric driven uncertainty in LW D from the physical driven one.
Heat (ADV θ ) and moisture (ADV q ) advection are inherently difficult to estimate correctly especially in heterogeneous environments. Yet they strongly impact the near-surface temperature and the surface energy balance (Heaviside et al., 2015). ADV θ and ADV q forcing are both positive and negative in the study period. Multiplication factors (Table 2) are applied to the negative advection values. Positive advection values receive the inverse multiplication factor for these tests (e.g., when ADV θ < 0 values are multiplied by 1.25, positive values are multiplied by 0.8 and vice versa).
To investigate the impact of turbulent mixing intensity we modify the exchange coefficients for heat (CH heat ) and momentum (CH mom ) for the urban, surface layer, and boundary layer schemes. These coefficients are linearly (Table 2) enhanced or decreased to modify the coupling between the surface and overlying atmosphere. The turbulent mixing impacts the performance of the surface layer scheme (Bosveld et al., 2014;Sterk et al., 2013) and can explain much of the biases (i.e., long-wave radiation, near-surface temperature, and surface energy fluxes) between model results and observations. Here we investigate fully convective boundary layers, whereas previously neutral and stable boundary layers have been studied. Hence our range of the multiplication factors for the exchange coefficients (0.67-1.50) is smaller than that used by Sterk et al. (2013) and Bosveld et al. (2014) (0.25-4).

Physical Ensemble Tests
Variability in model performance is also caused by the model physics used to parameterize subgrid scale processes. The PILPS urban  and the GABLS (Bosveld et al., 2014;Kleczek et al., 2014) studies have investigated this variability in model performance for different UCMs, boundary layer, and surface layer schemes. Using different NWP models (as seen in GABLS) complicates the analysis of model differences, as physics schemes also vary, thus adding uncertainty.
Using WRF, we can vary the individual physics schemes, while keeping others unchanged giving us a "physical ensemble" that enables us to test the third source of uncertainty (physics description). Here we consider the radiative transfer, boundary layer, and UCM schemes available in WRF (Table 3).
For the urban surface, BEP and BEP + BEM multilayer urban canopy models are tested (Martilli et al., 2002;Salamanca & Martilli, 2009) with SLUCM (Kusaka et al., 2001) being the reference choice. For the boundary layer we test the Yonsei University scheme (YSU) Hong et al. (2006) and Quasinormal Scale Elimination (QNSE) Scheme (Sukoriansky et al., 2005) both coupled to WRF-SLUCM-Noah setup. The radiation scheme is changed to the CAM short-wave and long-wave radiation schemes (Collins et al., 2004).

Analysis of Process Diagrams
Process diagrams (Bosveld et al., 2014;Sterk et al., 2013) allow the impact of atmospheric forcing, surface parameters, or parameterization schemes changes to be explored relative to a control run for a pair of variables. Each model run is represented by the mean of the two variables under investigation for a specified period (e.g., daytime or nighttime). We link these points to identify if the response is linear or nonlinear. Model sensitivity tests allow identification of the dominant influences on the pair of variables under investigation, within the ranges of the perturbed parameters and atmospheric forcing components (Tables 1, 2, and 3). The mean observations are shown to help explain differences between model and observations. Following Bosveld et al. (2014) we use four perturbations and the reference model run to capture changes in parameters and atmospheric forcing components (including drawing the sensitivity lines), but we only show the maximum and minimum limits as points in the figures. Sterk et al. (2013) and Bosveld et al. (2014) suggest that it is essential to identify variables combinations that are coupled/interdependent and are linked to the physical processes under investigation. Given the observations available we focus on (a) surface radiation balance, (b) surface energy flux partitioning, and (c) turbulent mixing separated by the time of day (based on surface net radiation Q * ) into day (Q * >0 W m −2 ) and night (Q * <0 W m −2 ) (Best & Grimmond, 2015), allowing the analysis of results under strong and weaker turbulent mixing regimes, when different physical processes dominate. CAM Note. Acronyms are explained in Appendix A1. Note. Aerosol optical depth (AOD) and CO 2 concentration values change, whereas the remainder use a multiplication factor.

Model Evaluation for the Reference Model Setup
Here we briefly evaluate the 50 m air temperature and net all-wave radiation (Q * ). For a more extensive evaluation see Tsiringakis et al. (2019).
For the entire evaluation period (06:00 UTC 23 to 06:00 UTC 24 July) the reference run (R1) has a mean bias error (MBE) of 0.11°C (Figure 1). The modeled T 50m is cooler during daytime (MBE = −0.55°C) and warmer at night (MBE = 0.90°C) compared to the observations, suggesting the simulated temperature range is important. Therefore, the changes in temperature (ΔT 50m = T 50m,max −T 50m,min ) during daytime and nighttime is also considered. The observed daytime ΔT 50m is 9.85°C and 9.52°C when modeled (Figures 1 and 3). A nighttime MBE of 0.75°C exists between the modeled ΔT 50m (6.26°C) and observed (7.01°C).
For Q * the reference run model has an MBE of −23.3 W m −2 originating from −43.6 W m −2 day MBE and a 0.7 W m −2 night. The daytime bias originates from short-wave downward radiation SW D (MBE = 14.3 W m −2 ), short-wave upward radiation SW U ( MBE = 24.8 W m −2 ), long-wave downward radiation LW D (MBE = −16.1 W m −2 ) and long-wave upward radiation LW U (MBE = 20.9 W m −2 ) (Figure S1) biases. The nocturnal bias in Q * originates from LW D and LW U biases of −8.1 and −8.8 W m −2 , respectively.

Surface Radiation Balance
From the sensitivity analyses the bias in SW D is caused by aerosol optical depth (AOD), as all other parameter changes do not decrease SW D . An AOD of 0.27 reduces the bias in SW D . Terra/MODIS AOD data (Levy et al., 2013) for the study period (not shown) indicate a sharp increase of AOD from 0.15 (outskirts of London) to 0.25 (central business district, CBD). However, improving the SW D estimate increases Q * MBE (to −52 W m −2 ) and net short-wave (S * ) at the surface. As the bias originates primarily from SW U , we can attribute this to the bulk albedo of the urban surface. By decreasing a roof to 0.12 we decrease the surface albedo ( Figure 2b) and reduce the Q * MBE (to −40 W m −2 ). The bias in bulk albedo originates from the 2-D canyon physical description in SLUCM (Tsiringakis et al., 2019).
The remaining bias in Q * is from the net long-wave radiation (L * ). The daytime bias in the LW D is only partially explained by uncertainty in ADV θ and ADV q (∼4-5 W m −2 , Table 4a). Large changes in CO 2 , not supported by observational data, would be sufficient to account for this bias. However, it is more likely that the bias originates from a negative bias in mid-to-upper boundary layer temperature and moisture, but this cannot be verified with the existing observations. The LW U biases are attributed to four different sources (CH heat , a kanda , c wall , and λ wall ), with the first two being the stronger contributors. CH heat and a kanda alter While correcting the bias in Q * is arbitrary, it does not lead to better model performance in general. The response of ΔT 50m for a given change in Q * strongly depends on which atmospheric forcing or surface parameter is modified (Figure 3).
During the day, changes in ADV θ have small effect in Q * , but strongly impact ΔT 50m (1.8 K difference). The lack of variability in Q * is from compensating changes in LW D and LW U (Table 4a) indicating that the boundary layer is in radiative balance with the surface, despite the drastic temperature change. An opposite effect is observed for changes in AOD, where the large decrease in Q * (up to −37 W m −2 from 0 to 0.36 AOD) are not followed by a change in ΔT 50m . This is caused by the increase in radiative heating due to SW D absorption at higher aerosol concentrations, which compensates for the decrease of air temperature by decreasing Q H . Changes in CH heat and CH mom have positive Q * −ΔT 50m relations with strong radiative heating at higher Q * and vice versa (Figures 3a, 3b, S7a, and S8a). The change in ΔT 50m can be attributed to the changes in Q H , which are caused by the changes in CH heat (Equation 2).
Variations in SLUCM surface parameters have different impacts. Changes in a roof and a kanda , like CH heat and CH mom , have a positive trend (Figures 3b, S3a, S7a, and S8a), with a roof having larger impact on Q * (up to 40.4 Wm −2 ), whereas a kanda has more impact on ΔT 50m . The a roof and a kanda increase Q * (and consequently Q H ), thus increasing ΔT 50m as well. However the effect of a kanda on Q H is also from changes in the roughness length of heat (Equation 1), which also impact the exchange coefficient of heat (Equation 2).
The effects of changing f urban ,C wall , and λ wall cause an increase in ΔT 50m with a decrease in Q * (Figure 3b). LW u increases with smaller C wall , λ wall , and larger f urban values (Table 4a). ΔT 50m increases with decreasing C wall and λ wall as ΔQ s is decreased, hence increasing Q H , while increasing f urban changes the vegetation fraction, thus decreasing the Q E and consequently increasing Q H . Changes in Q f and ADV θ both have the same orientation (Figures 3a and 3b) but different magnitude, with Q f not really affecting ΔT 50m during the daytime.  (Table 1) and atmospheric forcing components (Table 2) only model runs using minimum (open symbol) and maximum (filled symbol) limit of the uncertainty range are shown, with lines between through all five runs.

Journal of Geophysical Research: Atmospheres
At night changes in ADV θ , ADV q , and CO 2 impact both Q * and ΔT 50m (Figure 3c). However, changes in ADV θ influence ΔT 50m more, while ADV q mostly affects Q * , by increasing LW D radiation due to more water vapor in the boundary layer. The prescribed CO 2 concentration changes result in substantial changes in nocturnal Q * (>10 W m −2 ) and ΔT 50m . Increasing CH heat and CH mom result in an increase in ΔT 50m at night, which counteracts the daytime CH heat effects (Tables 4a, 4b, and Figure S8). The increase in nighttime ΔT 50m is due to a decrease in daytime ΔQ s (Figure 4a), which leads to smaller release of heat during the night ( Figure 5a) and thus more radiative cooling. Consequently nocturnal air temperatures are lower in experiments with high CH heat ( Figure S8).
Changes in Q f and a kanda strongly influence nocturnal ΔT 50m (similar to CH heat ) but cause minimal variation in Q * , given the radiative balance between LW U and LW D (Figure 3c and Table 4b). Increasing a kanda leads to a small decrease in nocturnal Q * due to the decrease in LW D , a result of the lower air temperature due to less Q H during the day. C wall and λ wall have a nonlinear behavior at nighttime, because of heat saturation effects of the urban fabric (Tsiringakis et al., 2019, and section 3.3). Both have a large impact on nocturnal ΔT 50m but also small effect (up to 3 W m −2 ) on Q * . Increasing f urban results in the same decrease in Q * from higher LW U but only a minor decrease in ΔT 50m . This suggests that nighttime cooling rates surprisingly do not show a strong response to changes in f urban . This can be explained by the strength of nocturnal radiative cooling. As discussed section S2.2, the increase in mean daytime air temperature due to higher f urban leads to similar (or even stronger) cooling during the night (Figures S6d, S8a, and S8b). This nonlinear feedback involves an increase in daytime T 50m followed by an increase in nighttime surface to air temperature gradient, thus leading to stronger atmospheric stability and more radiative cooling ( Figures S7-S10). Changes in f urban are not the only triggering mechanism. It exists for most surface parameters and atmospheric forcing components that we investigated. Its impact is largest with lower wind speeds during the second day of the case study (section S2.3).  For the radiative balance and its effects on radiative heating and cooling, we find five sources of uncertainty in this case study accounting for a large part of the bias between model and observations. As discussed, correcting AOD and a roof can explain the bias net-short-wave radiation and some of the bias in daytime ΔT 50m . Decreasing a kanda and increasing CH heat effectively removes heat faster from the urban surface, thus decreasing skin temperature and LW U flux, while increasing ΔT 50m in both day and night. Finally a small increasing in the intensity of ADV θ can compensate the ΔT 50m during the day. Some bias still remains in Q * and is primarily associated with the bias in LW D and some remaining bias in LW U .

Energy Partitioning
Analysis of the surface energy partitioning is essential to understand the overall impact on atmospheric forcing and surface parameter changes in ΔT 50m . It provides further insight in compensating effects between day and night.
Regarding changes in atmospheric forcing during the daytime, AOD and CO 2 effectively maintain the same energy partitioning ratio (1.55 to 1.59) for both positive and negative changes in Q * . Consequently any energy gain or loss at the surface is distributed equally between the Q H and ΔQ s fluxes (Figure 4a). Changes in CH heat and CH mom impose a variation in energy partitioning, with lower values leading to faster decrease in Q H and increase in ΔQ s . The variation here is primarily caused by the response of Q H due to changes in CH heat (Equation 2). Q E also decreases, but the flux variation from changes in CH heat is an order of magnitude smaller than for Q H (5 vs. 40 W m −2 ). The decrease in Q H results in less heat directed toward the atmosphere and more heat stored in the urban fabric, thus decreasing Q * because of lower LW D and larger LW U (Table 4a). Changing advection has similar effects, but with lower variation in energy partitioning and no effect on Q * due to the net long-wave radiative compensation.
Surface parameter changes have a wider impact in the parameter space (Figure 4b) compared to the change in atmospheric forcing, for the current choice of uncertainty range. Decreasing a roof increases the Q H /ΔQ s flux ratio, with faster increase in Q H compared to ΔQ s due to thermal saturation effect on the roof facet. Thus, for the same change in Q * there is a larger variability in ΔT 50m compared to changes in AOD (Figure 3b). Changes in a kanda have a nearly identical response as CH heat for the same reasons. The same occurs for changes in Q f and ADV θ , but despite their similar response in Q * and energy partitioning, each parameter affects the ΔT 50m differently during the day and night. C wall and λ wall increase the Q H /ΔQ s ratio with decreasing parameter values due to higher skin temperature and higher LW U .

Journal of Geophysical Research: Atmospheres
Although some atmospheric forcing (CH heat and CH mom ) and surface parameters (C wall and λ wall ) cause a large variation in flux ratio and ΔT 50m , their impact on T 50m is small. This can be explained by the opposite effects on T 50m between night and day ( Figure S8). For instance an increase in CH heat leads to an increase in Q H during the day but also decreases Q H at nighttime (Figures 4a and 4c). Figure 5 shows that heat supply from the urban fabric at night is lower due to less ΔQ s stored during the day as a result of the increase in CH heat . These compensating effects are what limits the change in T 50m . Whereas parameters like Q f and ADV θ have smaller variations in the Q H /ΔQ s flux ratio but have strong impact on ΔT 50m both during day (ADV θ ) or night (Q f ). Hence, both affect the day and nighttime T 50m due to strong effects on their temperature that propagate from day to night and vice versa ( Figures S8 and S9).
Following Tsiringakis et al. (2019) we do not derive the storage heat as the residual of surface energy balance, because of the accumulation of errors (Grimmond & Oke, 1999) and mismatch between the measurement footprint of the turbulent fluxes and the radiation fluxes (Schmid et al., 1991). Thus, we do not derive observed flux ratio during the day. However, the nocturnal Q H /ΔQ s ratio is strongly impacted by the sign and value of Q H , given the difference between modeled (0.15 W m −2 ) and observed (8.74 W m −2 ) nocturnal Q H , and the plausible range of nocturnal ΔQ s (−50 to −100 W m −2 ). At nighttime we use the objective hysteresis model  with the same coefficients as Ward et al. (2016) for this site forced by the observed Q * to derive an "observed" nocturnal ΔQ s (−74.3 W m −2 ), leading to an "observed" Q H /ΔQ s estimate of −0.12. Although this is not an observation it allows exploration of how atmospheric forcing or surface parameters might explain the bias between model and observations.
At night the Q * bias between model and observations is small (3 W m −2 ), but the flux ratio is much smaller in the model because Q H is nearly zero at night in the reference run. Increasing ADV θ and Q f or decreasing CH heat and a kanda can decrease the nighttime bias in the flux ratio. However, none can consistently correct biases in Q H /ΔQ s and ΔT 50m at the same time.

Intensity of Turbulent Mixing
The intensity of turbulent mixing, indicated by the bulk Richardson (Ri b ) number (between the 1st and 2nd model levels) aids in understanding the coupling between the urban surface and the overlying boundary

Journal of Geophysical Research: Atmospheres
layer. This can help us understand how the changes in the flux partitioning affect this coupling and how is this translated to the changes we identify in ΔT 50m .
Most of the changes in atmospheric forcing ( Figure 6a) and surface parameters (Figure 6b) limit the variability of the Ri b between −0.71 to −0.58 during the day, with daytime changes in AOD, CO 2 and ADV q having small impact on Ri b , and minimal effect in ΔT 50m . However, CH heat causes large changes in Ri b and ΔT 50m . Smaller increase ΔT 50m occurs at lower Ri b values and weaker turbulent intensity, while the opposite is true for higher Ri b values. This is somewhat counter-intuitive, because an increase in Q H with larger values CH heat would have increase the turbulent intensity and ΔT 50m . However, the increase in CH heat results in heat being transported more rapidly from the surface into the boundary layer, warming up the entire boundary layer and reducing the temperature gradient near the surface, driving Q H and Ri b down during the day ( Figure S10). Changes in ADV θ cause a small decrease in Ri b for an increasing in ΔT 50m (Figures 6a, S6a, and S6b). This is primarily caused by the increase in near-surface temperature gradient as colder air is advected above the warm surface.
Surface parameter changes show a clear influence on the ΔT 50m −Ri b variable space, with a 0.2°C to 2.5°C change in ΔT 50m for a 0.1 change in Ri b (Figure 6b). The a kanda , a roof , and Q f have the largest impact on Ri b during the day. These changes in Ri b and ΔT 50m primarily originate from a change in Q H .

Journal of Geophysical Research: Atmospheres
At night (Figure 6c) ADV θ , CO 2 , and CH heat show the largest impact on Ri b and ΔT 50m . Changes in ADV θ cause an increased ΔT 50m for weaker stability near the surface (smaller Ri b ). The decrease in atmospheric stability is caused by the decrease in near-surface temperature gradient, from the faster decrease in T 50m compared to changes in skin temperature, when strong negative ADV θ is applied (Table 4b). This response when ADV θ increases is from a stability feedback mechanism. This results in less radiative cooling and weaker atmospheric stability when daytime ΔT 50m decreases. Under low wind conditions, the impact of the feedback mechanisms intensifies and results in lower nocturnal ΔT 50m compared to the reference run ( Figure S6c). The remaining atmospheric forcing modification or surface parameter changes shows an increase in nocturnal cooling rate, coinciding with also a faster heating rate during the day. Strong radiative cooling after sunset is the main driver for the stronger atmospheric stability in the experiment runs at night (Figures 6c and 6d).
Nocturnal surface parameter changes have a large impact on Ri b and ΔT 50m , with changes in Q f and λ wall showing nonlinear responses during the night. Changes in C wall ,λ wall , and a kanda have a strong impact in ΔT 50m , which compensates for the larger daytime increase in ΔT 50m (Figures 6b and 6d). This compensation leads to similar nighttime T 50m for these surface parameters ( Figure S8a). Nocturnal ΔT 50m is strongly dependent on the radiative cooling and the atmospheric stability near the surface; thus surface parameters which increase daytime ΔT 50m result in stronger stability during the night. These findings support our hypothesis that a negative feedback mechanism exists between daytime ΔT 50m and nocturnal ΔT 50m .

Radiation Schemes
Using the CAM short-wave radiation scheme, instead of the reference RRTMG, increases the SW D bias to 51 W m −2 , while reducing the Q * bias to −17 W m −2 (Figure 2a). The difference between the schemes aligns with the changes in AOD, which indicates that CAM sw in WRF does not fully account for scattering by aerosols. This physical difference will mask the biases in net long-wave radiation when the CAM sw is used, making it more difficult to identify remaining biases, unless each radiation flux component is treated individually. Comparing CAM lw and RRTMG lw an increase in daytime LW D bias of 8 W m −2 is identified, causing an increase in Q * bias. The CAM lw aligns with the impact of a decrease in CO 2 concentration. That does not indicate an absence of CO 2 effects on LW D in CAM lw but is related to a different physical description of long-wave radiation between the schemes and it could originate from not accounting fully for water vapor effects on LW D and from the few spectral bands used compared to the RRTMG scheme. The exact same biases from CAM sw and CAM lw are seen throughout Figures 3a, 3c, 4a, 4c, 5, and 6 indicating the importance of the radiation scheme choice.

Boundary Layer Schemes
Changes in atmospheric forcing do not clearly explain daytime deviations between the different boundary layer parameterization schemes used (YSU, QNSE, and the reference MYJ). Although initially ADV θ appears to be the primary difference between YSU and the other two schemes, this is misleading as advection is the same for all. Changes in CH heat and CH mom cannot explain the variations, neither can changes in CO 2 or AOD. Considering that runs with different boundary layer schemes show small variation in Q * (Figure 3a), but large variation in ΔT 50m , the differences should be primarily driven by the physical description of turbulent processes. Indeed for the daytime the main difference between YSU and MYJ seems to be the explicit inclusion of entrainment of heat in the YSU scheme and a slightly enhanced surface Q H , potentially due to the different surface layer scheme YSU is coupled to. Moreover, decreasing ADV θ does produce the same effect as the explicit inclusion of entrainment, because both increase mean boundary layer temperature (consistent with the larger ΔT 50m for YSU) and boundary layer height in YSU (not shown). Thus, additional attention should be given not to misinterpret compensating effects from these two physical processes. At night the YSU scheme impact is similar to effects of increased CH heat (Figure 3c). This is expected as increased CH heat indicates stronger nocturnal stability via enhanced radiative cooling due to a stability related feedback mechanism (sections 3.4 and S2.3).
Differences in MYJ and QNSE are minimal for Q * (Figures 3a and 3c) and the Q H /ΔQ s partitioning (Figures 4a and 4c) at both time periods. However, substantial differences occur between them in ΔT 50m and stability (Figure 6a and 6c). Runs with the QNSE schemes have a smaller diurnal cycle of temperature and instability, reduced buoyancy flux during the day, and weaker stability at night. The nocturnal deviation from MYJ is caused by stronger wind shear in the QNSE schemes, which appears to be related with lower CH heat and CH mom at nighttime that reduce the radiative cooling at the surface.

Urban Canopy Models
The different model results from SLUCM, BEP + BEM, and BEP model can be explained reasonably well with the surface parameters differences. SLUCM can match the daytime BEP + BEM results reasonably well by decreasing a roof (to 0.12) and a kanda (to 0.80) and increasing C wall (to 1.8*10 6 J m −3 K −1 ). Decreasing the SLUCM a roof is consistent with the lower bulk albedo for the BEP + BEM and BEP models, which have different physical description of the urban morphology and shading. The a kanda change increases the modeled 2 m exchange coefficient of heat for SLUCM, which is lower than in BEP + BEM. The increase in SLUCM C wall increases the ΔQ s and can compensate the lack of a building energy model and air-conditioning cooling (present in BEP + BEM), which increases the heat capacity of the urban fabric as internal building temperatures remain lower. Exclusion of the BEM module in the multilayer scheme leads to substantially larger Q H and lower thermal storage. To match BEP with SLUCM in addition to the changes in a kanda and a roof are needed as well as, a decrease in λ wall (to 0.45) and a reduction of Q f to 0 W m −2 (BEP does not account for Q f ).
These modifications reduce the differences between SLUCM, BEP + BEM, and BEP, causing a daytime ΔT 50m difference reduction to 0.20°C (from 0.75°C) and a 0.06°C reduction at night (from 0.25°C). Q * is only improved during the day and becomes identical to BEP + BEM with large reduction in the differences between the SW U and LW U between the two schemes. The same is true for Q H /ΔQ s and the Ri b , which indicates more similarity in the surface fluxes and the near-surface atmospheric stability. However, at night substantial differences remain in Q * , due to higher skin temperatures in SLUCM and also in Q H /ΔQ s as Q H in SLUCM is lower compared to BEP + BEM.

Discussion
This analysis identifies the model response to changes in atmospheric forcing and urban surface parameter for a specific model configuration (section 2.2) and for a specific UCM (SLUCM). The same sensitivity analysis with different model configurations (e.g., different NWP or reference parameterization schemes) and UCMs are anticipated to lead to different model responses. This is to be expected due to differences in model setups and UCMs. Hence, we recommend other UCMs with more complex (i.e., multilayer schemes) and more simplified (e.g., bulk schemes) physical description of the urban surface to be tested. Similarly different case studies, urban areas, and predominant meteorological conditions (e.g., cloudy/rain period, different geostrophic wind speeds) need to be considered.
To ensure our analysis is not day-specific, we compare the model responses between the first and second days of the SUBLIME case study. During the second day geostrophic wind speed is substantial lower; thus sensitivity of the model's response to geostrophic wind is also tested. The model response remains similar between the 2 days, with the few differences linked to the nonlinear feedback between daytime T 50m and nocturnal atmospheric stability, causing nonlinear behavior for some variables due to the generally larger Ri b during the second day, an effect of the lower wind speeds.
The plausible uncertainty range in atmospheric forcing and surface parameters (as presented in section 2.3) is based on previous reported sensitivity tests and uncertainty estimates (Loridan et al., 2010;Wang et al., 2011;Zhao et al., 2014). It could however undersample or oversample the uncertainty range for specific parameters leading to skewed model sensitivity. It is expected that such an effect would be more profound on the actual range of the sensitivity and not so much on the orientation or linearity of the model's response in the parameter space. However, under low wind conditions the response for some parameters might be nonlinear due to stability effects. Therefore, careful selection for the uncertainty range and the frequency of sampling from the parameter range is essential. For CH heat and CH mom , changes in other atmospheric forcing or surface parameters will affect the calculated values, but their effect is small compared to changes in the multiplication factors.
The 1-D WRF-Noah setup and the boundary layer scheme influence the outcome of the sensitivity analysis.
In reality a boundary layer will react differently to changes in physical properties of the underlying surface. For example, any variation in the urban surface temperature over the urban area will likely result in a change of temperature advection and also wind speed due to changes in pressure gradient. Such processes cannot be represented in a 1-D where advection is prescribed but will require 3-D simulations. Such limitations in the representation of the overlying atmosphere might exclude feedback mechanisms that could change the sensitivity reported in this study.
Following Tsiringakis et al. (2019), we used the LW up calculated in the SLUCM rather than the WRF reference as the LW up from the long-wave radiation scheme uses an average aerodynamic surface temperature instead of the radiative skin temperature of the facets. SLUCM calculates this aerodynamic surface temperature diagnostically from the air temperature, Q H and the modeled exchange coefficient of heat. During our analysis we found that this aerodynamic surface temperature varies substantial from the radiative skin temperature of the urban facets. Since the BEP + BEM and BEP schemes use a similar approach to calculate the LW up (i.e., via the radiative skin temperature) we used the same approach for WRF-SLUCM-Noah. Note that this difference does not affect Q * at the surface, as this LW up is from the urban scheme, using the radiative temperature.
This study does not optimize the model performance but tries to understand how uncertainty in forcing and parameters affect key physical processes in the urban surface and overlying atmosphere. However, if one is mainly interested in improving model performance through optimization there are a series of potential techniques to do so. Loridan et al. (2010) and Zhao et al. (2014) used ensemble Kalman filtering and Monte Carlo approach to optimize the urban surface parameters in order to improve the SLUCM' s performance, with very promising results. For offline or even limited 1-D model simulation that might be computationally possible, but for full 3-D model simulations the computational cost might be prohibitive. Inverse modeling with the adjoint model of WRF (Zhang et al., 2013) might be more efficient.
Here we use modeled ΔT 50m rather than the modeled T 50m (following Sterk et al., 2013) to minimize the impact of biases introduced during spin-up phase and to allow analysis of model response to changes in atmospheric forcing and surface parameters separately for day and night. Thus we are able to easily distinguish compensating effects on T 50m between day and night (e.g., changes in λ wall and CH heat ) or carryover effects from day to night (e.g., ADV θ and Q f ). Moreover, the changes in ΔT 50m during day or night are directly linked with the changes in radiation and surface energy balance. This allows us to identify more accurately, which atmospheric forcing or surface parameter changes explains the bias between model and observations or differences between different model setups.

Conclusions
In a coupled NWP-UCM model setup, surface parameters and atmospheric forcing are the primary sources of uncertainty and strongly affect model performance. With WRF-SLUCM-Noah we investigate the impact of these sources during a 1-day clear-skies period in London. The impact of change in atmospheric forcing and surface parameters to the surface radiative balance, energy partitioning, and intensity of turbulent mixing are calculated together with the coupling between the surface and the overlying atmosphere.
Both atmospheric forcing and surface parameters changes impact the model's performance. For the radiative balance, AOD, a roof ,CO 2 , and a kanda are the most influential parameters, each impacting different terms. SW U and LW U cause the bias in modeled Q * . Correcting for the radiative bias improves the radiative heating and radiative cooling performance. For the surface energy flux partitioning, the model has the largest response to changes in CH heat ,a kanda ,Q f ,C wall , and λ wall . Changes in near-surface atmospheric stability are comparable in magnitude from most changes in atmospheric forcing and surface parameters, with different orientation for each of the sources. Changes in CH heat have the largest impact on daytime Ri b , while ADV heat , a kanda , C wall and λ wall are more critical at night. A feedback mechanism between increasing daytime T 50m and increase in nocturnal radiative cooling is identified. Its intensity is strongly depended on the wind shear.
between CH heat and a kanda , which are linked to the surface atmosphere coupling. Both AOD and a roof impacts on Q * and surface flux partitioning are similar. The reported compensating effects, evident from single variable analysis, are reduced by using the 2-variable space analysis (the so-called the processes diagrams). Various atmospheric forcing and surface parameter changes have similar effects, if not separated by time of day. This supports the Best and Grimmond (2015) suggestion to analyze model responses under different turbulence regimes.
We highlight that it is possible to identify physical description differences between the schemes used in our WRF-SLUCM-Noah setup. This is easier when key physical mechanisms are missing from a scheme (e.g., lack of aerosol effects on SW D ) but is more difficult when the process is not covered by the atmospheric forcing range of analysis (e.g., explicit entrainment flux in YSU compared to MYJ). This analysis can also identify compensating effect between atmospheric processes, as demonstrated by the fact that decreasing ADV θ shows a similar effect to including explicit entrainment. We could identify differences in the UCM schemes through the use of changes in the surface parameters and link them to differences in the physical complexity of the schemes, which allows to link uncertainty to either changes in atmospheric forcing or surface parameters. However, this approach has clear limitation (e.g., inability to explained the difference in nocturnal Q * between SLUCM and BEP + BEM).