Time‐Lapse Photogrammetry of Distributed Snow Depth During Snowmelt

Characterizing snowmelt both spatially and temporally from in situ observations remains a challenge. Available sensors (e.g., sonic rangers, lidar, airborne photogrammetry) provide either time series of local point measurements or sporadic surveys covering larger areas. We propose a methodology to recover changes in snow depth and snow cover extent over areas of up to 0.12 km² from a minimum of three synchronized time-lapse cameras. Our method uses photogrammetry to compute point clouds from a set of three or more images and automatically repeats this task for the entire time series. The challenges were (1) finding an optimal experimental setup deployable in the field, (2) estimating the error associated with this technique, and (3) minimizing the manual work needed in the data processing pipeline. Developed and tested in the field in Finse, Norway, over 1 month during the 2018 melt season, the method yielded a median melt estimate of 2.12 ± 0.48 m derived from three cameras 1.2 km away from the region of interest. The closest weather station recorded 1.94 m of melt. Other parameters like snow cover extent and duration could be estimated over a 300 × 400 m region. The software is open source and applicable to a broader range of geomorphologic processes like glacier dynamics, snow accumulation, or any other surface deformation process, provided that (1) fixed visible points exist within the area of interest and (2) the photographs resolve sufficient surface texture.


Introduction
Historically, a snow depth time series refers to a point measurement repeated in time at a particular location, realized with a probe, an ultrasonic ranger, or a laser. With the advent of technologies such as GPS, laser scanning (Deems et al., 2013), and digital photogrammetry (Bühler et al., 2016; Marti et al., 2016; Nolan et al., 2015), it is now possible to acquire detailed, but sporadic, spatial information about snow depth at horizontal scales ranging from centimeters with ground-based laser scanners to 100 m or more with sensors on satellite platforms. In windy environments, snow depth is highly variable (Liston, 1999) and remains a challenging variable to predict (Grünewald et al., 2013; Li et al., 2018). Snowmelt is a dynamic system important to climatic, ecological, and societal processes. In mountainous and northern regions, snowmelt releases a large amount of the annual freshwater available to ecosystems, agriculture, and hydropower production. It is also the time during which the landscape albedo drops abruptly. Coupled temporal and spatial information on snow depth during the accumulation and ablation seasons could (1) improve our understanding of deposition and ablation processes and (2) improve estimates of snow water equivalent in a given region (Margulis et al., 2019). While Picard et al. (2016) successfully built a custom-made laser scanner for this purpose, we propose an alternative method based on time-lapse photogrammetry, also called videogrammetry (Gruen, 1997), to resolve the evolution of the snow depth distribution with time.
Time-lapse photography has been used extensively for monitoring the dynamics of seasonal snow since the 1970s (Parajka et al., 2011). By placing stakes within the field of view of the camera, one can measure snow height. If the same camera is pointed toward a vegetation canopy, basic image processing suffices to extract a relative change of snow interception by canopies (Dong & Menzel, 2017; Garvelmann et al., 2013; Parajka et al., 2011). In Zackenberg, Greenland, Hinkler et al. (2002) used images from a time-lapse camera to estimate the snow cover extent throughout melt and proposed an alternative to the Normalized Difference Snow Index adapted to consumer grade cameras that only have three spectral bands (red, green, blue: RGB). The radiometric information of the pictures was later used to estimate snow surface albedo (Corripio, 2004). Farinotti et al. (2010) estimated the snow water equivalent of an alpine catchment by combining knowledge of the local topography, a simple melt model, and snow cover extent from an oblique time-lapse camera. Härer et al. (2013) automated the classification and reprojection processes. Finally, time-lapse imagery can also provide information about the evolution of snow surface morphology (Kochanski & Anderson, 2018), relying on changes of the image texture. In each of these cases, accessing the third spatial dimension would provide a direct estimation of snow depth and other morphometric changes.
With the affordability of digital photography and the development of photogrammetric software, techniques that once required a lot of manpower (Smith et al., 1967) can nowadays be mostly automated. Nolan et al. (2015) assessed the possibility of mapping snow depth with a consumer grade camera from repeated aerial surveys. In 2016, Bühler et al. (2016) and Harder et al. (2016) applied the same concept from an Unmanned Aerial System, and Marti et al. (2016) derived snow depth distribution maps with imagery from the Pleiades satellites. When such aerial surveys cannot be repeated at high frequency because of manpower, cost, or weather, a ground-based solution like the one developed by Blyth et al. (1974) can nowadays be automated.
Within the last few years, we counted at least six reports of time-lapse photogrammetry studies of geomorphological processes presented in articles or conferences (Abou Chakra et al., 2017; Eltner et al., 2017; Galland et al., 2016; James & Robson, 2014; Kromer et al., 2016; Mallalieu et al., 2017). The most comprehensive of these show that the key to success for this method is accurately defining ground control points (GCPs) within the scene of interest and identifying them in the subsequent images. Because cameras are prone to small movements (e.g., from temperature dilatation, frost heave, or thaw settlement), it is crucial to detect and estimate any motion of the images with time. Eltner et al. (2017) used premade and recognizable GCPs in small scenes (a few meters wide), automatically identifiable by their photogrammetry software. This technique becomes problematic when the size of the scene increases and the targets are at risk of being covered with snow. This is why Mallalieu et al. (2017) used natural landmarks as GCPs. However, in this latter study, the identification of GCPs was fully manual and could not be scaled to long time series of images. We present, in this technical report, an algorithm to automatically derive 3-D models from a set of at least three stacks of image time series. After an initial definition of the GCPs on a reference set of images, tie points across all images are automatically detected with the Scale-Invariant Feature Transform (SIFT) algorithm. Unlike most photogrammetry studies applied to snow, our software is fully open source. We developed and tested our method in the field at Finse, Norway, during the 2017 and 2018 melt seasons.

Methodology
The concept is to acquire simultaneously a minimum of three pictures of a scene from slightly different viewing angles to compute 3-D models of the topography, which can later be differenced to estimate changes in snow depth. All stacks of time-lapse images are fed to our algorithm (Figure 1) to (1) sort good from poor quality images and create sets of images based on timestamps, (2) mask unnecessary areas of the images, (3) compute the camera positions and orientations, (4) retrieve tie points across all image stacks for coregistration of models, (5) compute 3-D models (point clouds) for each time step, (6) postprocess the point cloud models, and (7) extract snow depth from rasters. The software consists of a wrapper written in Python 3 using at its core the open-source project Micmac (Rupnik et al., 2017) and the Python libraries Numpy, Pandas, OpenCV, Pyxif, Lxml, and Pdal. Micmac is used for the four crucial steps, (2)-(5).
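The orchestration of these seven steps can be sketched as follows (a minimal illustration only; the function and step names are hypothetical placeholders for the underlying Micmac commands and Python-library calls):

```python
def run_pipeline(timesteps):
    """Sketch of the seven-step pipeline. Steps (1)-(4) operate on the
    whole image time series at once; Steps (5)-(7) repeat per time step."""
    log = []
    for step in ("sort_images", "mask", "orient_cameras", "tie_points"):
        log.append(("all", step))          # Steps (1)-(4), done once
    for t in timesteps:
        for step in ("dense_cloud", "postprocess", "snow_depth"):
            log.append((t, step))          # Steps (5)-(7), per time step
    return log

log = run_pipeline(["2018-05-10", "2018-05-11"])
```

Because Steps (1)-(4) are shared by the whole series, adding a time step only adds the per-timestep cost of the dense reconstruction and its postprocessing.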
The image quality (Step 1) is estimated from two metrics: brightness (obtained from the image exif), used to discard nighttime images, and blurriness (the variance of the image Laplacian), as defined in Pech-Pacheco et al. (2000), used to detect foggy conditions. While Steps (2) and (3) are standard to any photogrammetric project, Step (4) applies SIFT (Lowe, 2004) to all images at once, as outlined by Feurer and Vinatier (2018). This allows automatic detection of tie points across the stack of images and therefore compensates for small movements of the cameras.
Step (5) uses dense multiview stereopsis (Furukawa & Ponce, 2010) to compute point cloud models. All models are then converted to orthorectified images and elevation rasters. From the orthorectified images it is possible to identify snow-covered areas from pixel values and therefore map snow cover extent for each time step. The RGB orthorectified image is converted to the HSV color space. After equalization of the value band, snow and ground can be separated with a simple threshold. From the elevation models, snow depth evolution can be mapped in all places where the images converge and contain enough texture for retrieving depth.
Throughout deployment, a few steps must be followed carefully to optimize results. First, it is important to plan and scout a scene with a concave shape where the cameras can be installed as normal as possible to the region of interest (ROI). The presence of fixed landmarks (e.g., large rocks) distributed over the entire ROI is necessary for a good registration of the final models. Second, the cameras must be fixed for the time span of the experiment to allow only minimal movement. The angle between two cameras and the ROI should be between 10° and 20°, with a common central point to maximize overlap across the different viewpoints. For snow, images should be underexposed (e.g., −1 EV) to maintain contrast in bright areas. The aperture should be set in the midrange to maximize depth of field and sharpness, and the ISO set between 100 and 400 depending on the noise level of the camera sensor. The camera timestamps should be synchronized, and the shutter releases should not be more than 25% of the time interval apart from each other. Landmarks can be surveyed with a differential GPS (dGPS) or a total station to reference the models to an absolute coordinate system. The final error will depend on the scene geometry, the number of viewpoints, the camera systems, and the possibility to coregister models with one another using stable and lasting landmarks. Finally, the software should treat all cameras as independent entities, since the internal orientation (calibration) will vary even if all camera-lens systems are the same.
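These setup rules lend themselves to simple numerical checks before deployment, for example (a sketch with made-up coordinates; the function names are our own):

```python
import math

def camera_separation_deg(cam_a, cam_b, roi):
    """Angle (degrees) subtended at the ROI center by two camera
    positions, using planar (x, y) coordinates in meters."""
    ax, ay = cam_a[0] - roi[0], cam_a[1] - roi[1]
    bx, by = cam_b[0] - roi[0], cam_b[1] - roi[1]
    c = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def shutters_synchronized(trigger_times_s, interval_s):
    """Shutter releases should be within 25% of the interval of each other."""
    return max(trigger_times_s) - min(trigger_times_s) <= 0.25 * interval_s

# Two cameras 300 m apart, roughly 1.2 km from the ROI center:
# the subtended angle falls inside the recommended 10-20 degree window.
angle = camera_separation_deg((0.0, 0.0), (300.0, 0.0), (150.0, 1200.0))
ok = shutters_synchronized([0.0, 120.0, 700.0], 3600.0)
```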

An Example Case: Monitoring Snowmelt in Finse, Norway
The site of Finse in Norway (60.58°N, 7.48°E) has ideal conditions to test such a system (Figure 2). The average maximum snow accumulation is about 2 m deep, with great spatial variability (0 to >5 m) due to high wind speeds and a rugged topography. The landscape is covered by large erratic boulders that are visible even at maximum snow accumulation and can be used for stable camera installations (Figure 2c) as well as GCPs when located within the images' field of view (Figure 2d). We equipped the site with three camera systems converging on a large area of 300 m × 400 m with snow drifts melting over a long period (>1.5 months) in front of Middalsbreen glacier. We used Sony ILCE-QX1 cameras (APS-C sensor, 20 megapixels) with a Sony 20-mm pancake lens in a box protected by a standard ultraviolet filter. The cameras are controlled by custom-made intervalometers triggering hourly from a real-time clock (DS1337C) signal (1.7-s drift per day at 5 °C). We collected data over two seasons. During the first melt season of June and July 2017, we tested the experimental setup to find the optimal configuration and estimated the accuracy of measurements of this method. We collected images irregularly during this season and only present hereafter a direct comparison of the model from 12 July to a dGPS track of the same day. However, during the second melt season, from 10 May to 12 June 2018, the cameras continuously recorded hourly images from which we selected midday images, for a total of 34 sets, to extract snow depth change and snow cover duration.
Over this 34-day period, the landscape underwent significant changes. We manually picked seven GCPs on one set of images defined as the reference (27 May) and kept 17 for an independent error assessment. The software then automatically identified tie points in all other images in reference to this date (Step 4) to coregister all models in a single coordinate system. We found an average of 1,019 ± 523 tie points common to each image pair. Each computed point cloud is then cropped to the ROI. These point clouds are processed with a morphological filter (Pingel et al., 2013), as part of the Python library PDAL (PDAL, 2018), to remove noise and extract valid points describing the topography. A raster is then derived from these filtered point clouds in which each pixel represents a square column of 0.5 m, counting on average nine points per pixel with snow (Figure A3). For each pixel, the minimum, mean, maximum, and standard deviation of elevation, as well as the number of points, are computed over the entire ROI. By selecting the mean elevation only for pixels classified as ground, we estimated a vertical absolute shift to correct for (Figure 4b). Then, from the mean elevation band, we retrieve snow depth by differencing each model with the last one on 12 June (Figures 3 and A1). A median filter with a kernel of 5 × 5 pixels was applied to the final result to reduce high spatial frequency noise.

Figure 4. Error assessment of photogrammetric models with respect to independent dGPS measurements and snow depth measurements. (a) The 938 snow depth time series extracted from 10 May to 12 June 2018 daily digital elevation models; in blue the median snowmelt, in dashed blue a linear regression fitted to the median curve, and in yellow data from a sonic ranger measuring snow depth 3.8 km away, at the same elevation. Data from the sonic ranger are only available through mid-May and after 14 June. Original photos from 13 May and 9 June are shown to illustrate light conditions. The vertical shift indicated in blue in the lower panel corresponds to the vertical correction applied using common bare ground areas only between the last and any given time step. (b) GCP absolute position deviations from the 2018 models over an intensity map from 10 May (dark regions corresponding to ground, and gray and white to snow). The disk color corresponds to the averaged median difference of height between the absolute GCP locations and the corresponding models over the 34-day period. The disk size is the corresponding averaged standard deviation of the same metric. Disks with a black outline (6, 18, 30, 34) are reference GCPs for model registration. Coordinates are given in meters, UTM zone 32N. (c) Scatter plot of elevation measured with dGPS and from the photogrammetric models over snow-covered areas on 12 July 2017. The solid black line indicates the statistical linear regression, and the dashed black line shows the 1:1 line. GCPs = ground control points; dGPS = differential GPS.
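The per-pixel rasterization step can be sketched with Numpy and Scipy (mean elevation and point count per 0.5-m cell only; the actual software also stores the minimum, maximum, and standard deviation per pixel and relies on PDAL for the point cloud handling):

```python
import numpy as np
from scipy.ndimage import median_filter

def rasterize(points, cell=0.5):
    """Bin a point cloud (N x 3 array of x, y, z) into square cells of
    side `cell` meters; return mean elevation (NaN where empty) and count."""
    ij = np.floor(points[:, :2] / cell).astype(int)
    ij -= ij.min(axis=0)                      # shift indices to start at 0
    shape = tuple(ij.max(axis=0) + 1)
    count = np.zeros(shape)
    total = np.zeros(shape)
    np.add.at(count, (ij[:, 0], ij[:, 1]), 1)
    np.add.at(total, (ij[:, 0], ij[:, 1]), points[:, 2])
    mean = np.where(count > 0, total / np.maximum(count, 1), np.nan)
    return mean, count

# Synthetic flat snow surface at 2.0 m elevation over a 5 x 5 m patch.
rng = np.random.default_rng(0)
pts = np.column_stack([rng.uniform(0, 5, 1000),
                       rng.uniform(0, 5, 1000),
                       np.full(1000, 2.0)])
mean, count = rasterize(pts)
smooth = median_filter(np.nan_to_num(mean), size=5)  # 5 x 5 kernel, as in the text
```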
To evaluate the accuracy of the method, we first compared the total mean melt over the snow-covered area to a nearby snow depth sensor (Figure 4a) located 3.8 km away at the same elevation as the ROI. The snow depth record shows a total melt of 1.97 m from 10 May to 14 June. Figure 4a shows the change in snow surface elevation for 938 different pixels randomly picked in areas where snow lasted more than 25 days during the 34-day period (Figures 4a and A2). Each time series, represented by a gray line, exhibits a noisy signal owing to the noise generated during the point cloud computation. However, when aggregated to a median value, we observe a lowering of the snow surface of 69 mm/day, corresponding to 2.12 m of snow melted in 34 days.
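The aggregation of noisy per-pixel series into a robust median melt rate can be illustrated with synthetic data mimicking the numbers above (938 series, ±0.34 m vertical noise, a 69 mm/day surface lowering; the series and noise values are taken from this study, the data themselves are simulated):

```python
import numpy as np

rng = np.random.default_rng(42)
days = np.arange(34)
true_surface = -0.069 * days  # 69 mm/day lowering of the snow surface (m)

# 938 noisy pixel time series with the model's +/-0.34 m vertical noise.
series = true_surface + rng.normal(0.0, 0.34, size=(938, days.size))

median = np.median(series, axis=0)            # robust aggregate per day
rate_m_per_day, _ = np.polyfit(days, median, 1)
```

Although each individual series is dominated by reconstruction noise, the fitted slope of the median curve recovers the imposed melt rate closely.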
The colored disks in Figure 4b indicate the median bias of each GCP by their colors and the relative accuracy of the model with respect to the GCP dGPS measurements by their sizes. Within the ROI of Figure 4b, four GCPs were used for the absolute registration of the models, leaving 17 of them to evaluate the models' accuracy. We find an average vertical bias of 0.55 m across all GCPs and a mean standard deviation of 0.34 m. After propagating the error corresponding to the differencing of two surfaces (√(0.34² + 0.34²) = 0.48 m), snow depth can be estimated with an error of ±0.48 m. From the 12 July 2017 model, for which we have a track of dGPS measurements on the snow surface, we find a general vertical bias of 0.53 m and a vertical residual standard deviation of 0.43 m (Figure 4c). Both error estimates of the vertical positioning of the reconstructed models are close, but given that the 12 July 2017 model was computed independently of the others, we consider the first error estimate (for the 2018 season), which takes advantage of the SIFT algorithm applied to an entire stack of images, to be more representative of what this method can accomplish with this experimental setup. Therefore, in this example case, we estimate a snowmelt of 2.12 ± 0.48 m.
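The error propagation used above is the standard rule for differencing two independent surfaces, each with a vertical standard deviation of 0.34 m:

```python
import math

sigma_model = 0.34  # vertical standard deviation of one model (m)
# Variances add when two independent surfaces are differenced:
sigma_depth = math.sqrt(sigma_model**2 + sigma_model**2)  # ~0.48 m
```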

Discussion
While this technique proves promising for recording geomorphologic processes like snowmelt, it comes with limitations. First, the noise associated with the point cloud reconstruction, which leads to the observed vertical positioning errors relative to the reference model (±0.34 m), cannot be reduced unless more simultaneous viewpoints are provided for the reconstruction. Increasing the number of cameras would also decrease the number of artifacts in the resulting point clouds. For instance, inconsistencies of the point cloud reconstruction visible in Figure 3 on 28 May, or the coregistration misalignments highlighted by the wide time series spreads on 13 May and 9 June in Figure 4a, could be addressed with additional information about the scene. The brushed patterns aligned with the central camera direction (NE-SW) visible in Figure 3 can also be attributed to too few viewpoints and too narrow angles between the cameras during the depth map reconstruction. An increased number of cameras spread over a larger arc around the ROI should significantly improve the models.
The long distance and the shallow angle between the cameras and the ROI lead to higher noise levels and even make reconstruction impossible in places, despite the overall concave geometry of the experimental setup (see the lack of information in the upper right and lower left corners of the ROI). A closer converging focal point for the cameras, and therefore a closer ROI (smaller geometry), could have produced more accurate 3-D models, though this will require further investigation and will be constrained by the overall terrain geometry. This example setup, with only three cameras located 1.2 km away from the ROI, therefore reaches the limit of what this technique can resolve while still evaluating a reasonable snowmelt signal (2.12 ± 0.48 m). If scaled down, an analogous experimental setup should theoretically produce better results and be better adapted to shallow (≤1-m deep) snowpacks.
While studies using direct readings of snow depth on a stake within the field of view of a time-lapse camera achieve very high accuracy, with root-mean-square errors of a few centimeters (Dong & Menzel, 2017; Garvelmann et al., 2013), our method exhibits errors closer to those of studies estimating snow depth from aerial photogrammetry (±0.30 m; Bühler et al., 2015; Nolan et al., 2015). Our method uses a similar approach, retrieving snow depth as a derived product of photogrammetry, but with fewer images taken farther away from the object of interest.
This technique requires texture in the images for an optimal point cloud reconstruction. Sun cups typically present at the snow surface during melt create such texture, but if, for instance, the snow surface had been covered by fresh snow, the sudden loss of texture would have prevented the reconstruction of 3-D models over the snow areas until this same layer of fresh snow exhibited sharp textures again. During other time periods, snow bedforms could provide sufficient texture for this method to perform. Here we chose a location with little to no vegetation, as the complexity of developed canopy structures (e.g., shrubs, trees) can also cause the photogrammetric reconstruction to fail.
The software is at its first version, and many improvements to the procedure described herein can be made to increase automation, improve 3-D model quality, and reduce coregistration errors between 3-D models.

The higher uncertainty for the 2017 model (0.43 m) in comparison to the 2018 one (0.34 m) could be attributed to a poor camera calibration for the 2017 image triplet, most likely due to the few tie points identified between the images. In 2018, this was compensated by the large number of temporally close time steps (and therefore images) in the data sets. To improve the coregistration of point clouds, we see potential in developing alternatives to the SIFT algorithm used to automatically detect common features (tie points) between two images, based on image properties less sensitive to changes in solar illumination and to the rapidly changing hard contrast between snow and ground. Further use of postprocessing techniques like the one developed by Kromer et al. (2016) could significantly improve the results by taking advantage of redundant information in the point clouds in both space and time, if originally included in the experimental design. Finally, because this technique relies on the dynamic range of the cameras, using cameras with deeper radiometric resolution (i.e., 16 bit vs. 8 bit) would also improve the final result.
As described previously, many parameters influence the uncertainty and quality of the results. In the case study presented above we demonstrated the potential of this new method but fortuitously reached the limit of what it can accomplish in resolving snow depth change. While software improvements can further reduce uncertainties, future studies should also focus on the influence of the experimental setup on the sources of uncertainty. For instance, one could quantify the effects of the scene size, the number of cameras, and the layout of the cameras with respect to the ROI on the uncertainties in measuring surface deformation.

Conclusion
Characterizing snowmelt simultaneously in time and space is still a challenge. Many studies have proposed the use of ground-based time-lapse imagery to extract useful metrics (e.g., snow cover extent) during melt, sometimes combined with numerical melt models. We propose an alternative method based on consumer grade cameras and open-source software to derive a time series of 3-D models directly and automatically from time-lapse imagery. These models can be analyzed to extract snow depth change, snow cover extent, snow cover duration, and spatial melt rates. Tested in Finse, Norway, this setup shows the potential and some of the limitations of the technique. The melt rates obtained are comparable to nearby direct measurements of snowmelt.
The main value of this new algorithm is the automation of tasks that would otherwise be time consuming. By implementing steps that sort good from poor quality images and automatically detect tie points across a stack of time-lapse images, the user input is considerably reduced. The current version compensates for slight movements of the cameras, arising from thermal deformation or any other source. In our case, the cameras were mounted on large erratic blocks that were still subject to differential settlement during thawing of the ground. Further improvements of the algorithm are possible and could focus on the point cloud postprocessing methods.
Overall, we were able to demonstrate that this relatively low-cost solution could be
• used to monitor spatial changes in snow depth and snow cover extent and duration during the melt period. With cameras located 1.2 km away from the ROI, it could resolve an accuracy of 0.48 m and detect a median snowmelt of 2.12 m from 10 May until 12 June.
• automated by taking advantage of landmarks in the field of view, as well as image processing algorithms to correct for potential camera movements.
• implemented in a fully open-source solution using Micmac at its core and a variety of Python libraries for handling images, the detection of tie points, and the processing of point clouds.
Here applied to snowmelt, this technique could as well be applied to other geomorphologic processes like snow accumulation, snow bedform formation, glacier dynamics, permafrost creep, or any process where visible geometries deform. To succeed, one should consider the spatial and temporal scales of the process and ensure that the coregistration of point clouds is feasible from either absolutely fixed cameras or stable landmarks in the region of interest.
The method developed above allows for spatially and temporally detailed products such as daily snow depth maps (Figure A1) or snow cover duration maps (Figure A2). Similarly, it is also possible to extract detailed metaproducts of the method, such as the number of points per pixel used to derive a raster from the original point cloud (Figure A3).

Figure A1. Snow depth maps for each triplet of images during the 34-day period. Snow depths are differenced in reference to the last day, 12 June. Pixels with no values or classified as ground are transparent. The maps' spatial extent, orientation, and resolution are the same as in Figure 3.